Q4 2019 saw the introduction of key new functionality in the Intent HQ platform, enabling new ways of triggering data processing and creating customer profiles. For the first time, the platform can execute Python scripts and run pipelines exclusively on subsets (samples) of the entire dataset.
Running Python Scripts as a part of the Data Processing Pipeline
The IHQ Platform uses a set of configurations to define the data processing and enrichment steps of a pipeline run. Python scripts can now be run as part of those steps.
Python's versatility makes it popular with data scientists, data engineers and other users, who use it to generate new value by combining information from multiple data sources. Additionally, users can utilise the Intent HQ DSL to help their Python scripts reference data sources and process information efficiently.
Enabling Pipeline Runs for a Fixed Sample/Subset of Profiles
In most scenarios, testing and validating configuration or feature engineering on a production-sized dataset is expensive, time-consuming and highly inefficient at scale.
In this release, we have added the capability to run a pipeline for a chosen set of customer profiles. For example, this may be a subset of users selected as a sample to serve as test or validation datasets during feature engineering.
Whether the goal is to validate configuration or enrichments (in DSL or Python) on a smaller sample before going to production, or to use the subset for further work, we encourage sampling for data exploration, given the cost and effort associated with full datasets.
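As an illustration of what a fixed sample can look like, here is a minimal Python sketch of one common way to pick a reproducible subset of profiles by hashing their IDs. It is not the Intent HQ platform's own mechanism or API: the function names, the salt value and the ID format are all assumptions made for this example.

```python
import hashlib


def in_sample(profile_id: str, sample_rate: float, salt: str = "validation-v1") -> bool:
    """Return True if the profile belongs to a fixed, reproducible sample.

    Hypothetical helper: the same ID, rate and salt always give the same answer,
    so repeated pipeline runs see the same subset of profiles.
    """
    digest = hashlib.sha256(f"{salt}:{profile_id}".encode("utf-8")).digest()
    # Interpret the first 8 bytes of the hash as an integer in [0, 2**64).
    bucket = int.from_bytes(digest[:8], "big")
    # Keep the profile if its bucket falls below the requested fraction of the range.
    return bucket < int(sample_rate * 2**64)


def select_sample(profile_ids, sample_rate: float):
    """Filter an iterable of profile IDs down to the fixed sample, preserving order."""
    return [pid for pid in profile_ids if in_sample(pid, sample_rate)]


if __name__ == "__main__":
    ids = [f"customer-{i}" for i in range(10_000)]  # made-up IDs for illustration
    sample = select_sample(ids, 0.1)
    print(f"Selected {len(sample)} of {len(ids)} profiles")
```

Because membership depends only on the ID and the salt, a larger sample rate always contains the smaller sample, which is convenient when a validation set needs to grow without reshuffling.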
Launching Pipelines Instantly when New Data is Ingested
To make data transformation easier and more automated, Intent HQ has added functionality to trigger data processing (pipelines) once file ingestion has completed.
This means the platform automatically recognises and references a given pipeline configuration as soon as the ‘expected’ data has been received and ingested, and continues to generate enriched profiles.
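The idea of launching a pipeline once the expected data has arrived can be sketched in a few lines of Python. This is only a conceptual model under stated assumptions, not the platform's implementation: the class name, the file names and the launch callback are hypothetical.

```python
from typing import Callable, Iterable, Set


class IngestionTrigger:
    """Hypothetical sketch: launch a pipeline once every expected file is ingested."""

    def __init__(self, expected_files: Iterable[str], launch: Callable[[], None]):
        self._expected: Set[str] = set(expected_files)
        self._received: Set[str] = set()
        self._launch = launch

    def on_file_ingested(self, name: str) -> bool:
        """Record an ingested file; return True if this call launched the pipeline."""
        if name not in self._expected:
            return False  # unexpected files never trigger a run
        self._received.add(name)
        if self._received == self._expected:
            self._launch()
            self._received.clear()  # reset so the next delivery can trigger again
            return True
        return False


if __name__ == "__main__":
    trigger = IngestionTrigger(
        ["transactions.csv", "customers.csv"],  # made-up file names
        launch=lambda: print("Launching pipeline"),
    )
    trigger.on_file_ingested("transactions.csv")
    trigger.on_file_ingested("customers.csv")
```

Resetting the received set after each launch lets the same trigger serve recurring deliveries, such as a daily drop of the same files.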