Connecting to MongoDB using Apache Drill in ETL Validator

Over the last few years, we have worked with a number of customers across industries such as Life Sciences, Financial Services, and Higher Education. While the problems related to data testing are similar across industries, and JSON and flat files are common everywhere, the other file formats in use differ considerably. In a few industries Avro is popular, while in others, depending on the use case, Parquet is more common.

 

At Datagaps, one of our key challenges is empowering customers to test any of these formats with ease. We could build a native connection for each format, but that is not always the best option. Instead, we try to leverage open standards and open source frameworks to support our customers. This is where Apache Drill (https://drill.apache.org/) comes into play.

 

Over the last 6 months, we have been drilling around a bit and absolutely love the speed and flexibility that Apache Drill provides. As of version 3.4.5, we use Drill as the interface between ETL Validator and any of the file formats mentioned above (except flat files, which are much more common and deserve native connectors). In this post, I want to take a few minutes to explain how easy it is to get started with Drill and integrate it with ETL Validator.

 

read more..

ETL Validator for testing Transactional Systems

Most of the use cases that we encounter are data warehouse (DW) scenarios. Recently, however, we have also seen many scenarios for testing transactional systems. This post highlights one such example.

 

read more..

Database Testing - Ensure acceptable variance in data differences

At a recent event, one of the attendees, a prospect, came over to our booth and asked me to demonstrate an interesting use case. I had received a similar request from another prospect shortly before, so I thought it would be a good idea to write a post showing how it can be done using ETL Validator.

 

Problem: There is a source table and a target table. The attendee wanted to find the difference in a numeric field between the two tables. In addition, he wanted to specify an acceptable variance and define a rule on it. If the difference is within the limit, the test case should be marked as a success. If the difference exceeds the variance, the test case should be marked as a failure. In just a few minutes, we were able to demonstrate this use case using version 3.4 of ETL Validator.
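
To make the rule concrete, here is a minimal Python sketch of the pass/fail logic described above. It is not how ETL Validator implements the check; the function names, the dictionary-based "tables", and the use of an absolute (rather than percentage) variance are all assumptions made for illustration.

```python
# Illustrative sketch only: names and data layout are hypothetical,
# and the variance is assumed to be an absolute difference.

def check_variance(source_value, target_value, max_variance):
    """Return 'success' if the two values differ by at most max_variance."""
    difference = abs(source_value - target_value)
    return "success" if difference <= max_variance else "failure"


def compare_tables(source_rows, target_rows, max_variance):
    """Apply the variance rule to every key present in the source.

    source_rows and target_rows map a row key to the numeric field
    being compared. A key missing from the target counts as a failure.
    """
    results = {}
    for key, source_value in source_rows.items():
        if key not in target_rows:
            results[key] = "failure"
        else:
            results[key] = check_variance(
                source_value, target_rows[key], max_variance
            )
    return results


# Hypothetical sample data: order id -> amount
source = {101: 1000.0, 102: 250.0, 103: 75.0}
target = {101: 1004.5, 102: 260.0}

results = compare_tables(source, target, max_variance=5.0)
```

With the sample data, order 101 differs by 4.5 and passes, order 102 differs by 10.0 and fails, and order 103 fails because it has no matching target row.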

read more..