Data and analytics needs are evolving as fast as the tools being built to serve them. The sheer amount of technology available today can make it difficult for any data leader to know which tools to use and how to leverage them to maximum benefit. Rickety connectivity between disparate tools won't make a data infrastructure work; instead, looking at the broader trends and successful strategies applied in the data and analytics space can revolutionize the way your organization gets data to the people who need it most.
Cameron O’Rourke, Senior Director of Product Strategy at Incorta, and Eldad Chai, CEO of Satori, joined the DBTA webinar, Top Trends in Data Engineering, to discuss key patterns and approaches that are lighting up areas of data and analytics in need of some technology TLC.
O’Rourke examined how many organizations approach data engineering, and where those approaches fall short. Although the “modern” data pipeline has promise, it ultimately moves data through many stages, eroding its quantity and quality and increasing the time it takes to reach its destination. The journey from the initial landing area, through optimization, to its final state leads to data loss, stale data, and a lack of precision and flexibility.
The agile data lakehouse is a strategy that embraces decentralization and scale; or, as O’Rourke put it, it means doing more with less.
“We want to move the data in its original form and keep that original form, and then apply different use cases and workloads to the data, essentially as it is. That means different analytics issues and different domains have access to information the way they’re used to seeing it,” O’Rourke said.
This method provides data scientists with a large amount of raw data while increasing accessibility through an open data store that users can access directly. O’Rourke conceded that this process is anything but trivial; how does it actually work?
O’Rourke explained that the key to improving data engineering is to let the most advanced and capable query engines available today do more of the work, reducing the upfront data engineering effort. While common engineering operations typically consist of flattening, transforming, and aggregating data before a query engine processes it, data engineers are now leveraging analytics platforms and fetching data directly from source systems. Leaving data in its original form allows users to harvest insights from it instantly, without waiting for it to be prepared, increasing speed, freshness, and accuracy.
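The contrast can be sketched in a few lines: instead of pre-flattening nested records in an ETL step, land them as-is and let the query engine extract and aggregate at read time. This is a minimal illustration of the pattern using SQLite's built-in JSON functions as a stand-in for a modern analytics engine; the table, fields, and records are hypothetical, not from the webinar.

```python
# Sketch: query-time shaping of raw data, rather than upfront flattening.
# SQLite's JSON functions stand in for a modern query engine; all names
# and records here are hypothetical.
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders_raw (doc TEXT)")  # data kept in its original form

# Land source records as-is; no upfront flattening or aggregation.
records = [
    {"order_id": 1, "customer": {"region": "EMEA"}, "amount": 120.0},
    {"order_id": 2, "customer": {"region": "APAC"}, "amount": 75.5},
    {"order_id": 3, "customer": {"region": "EMEA"}, "amount": 42.0},
]
conn.executemany(
    "INSERT INTO orders_raw VALUES (?)",
    [(json.dumps(r),) for r in records],
)

# The engine does the shaping at read time: extract nested fields and
# aggregate per region, directly against the raw documents.
rows = conn.execute(
    """
    SELECT json_extract(doc, '$.customer.region') AS region,
           SUM(json_extract(doc, '$.amount'))     AS total
    FROM orders_raw
    GROUP BY region
    ORDER BY region
    """
).fetchall()
print(rows)  # [('APAC', 75.5), ('EMEA', 162.0)]
```

Because the raw documents are preserved, a new workload with different shaping needs can simply issue a different query, with no pipeline rework.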
O’Rourke concluded that breaking free from the traditional pipeline and offering greater simplicity, agility, and speed are the critical responses needed to enable modern and efficient data architecture.
Chai’s analysis of data engineering trends focused on Satori’s survey of more than 300 data leaders in data engineering, architecture, business intelligence, analytics, and data science, across many different industries and organization sizes. Satori inquired about common data challenges, including data sharing and comprehensive roadmaps for continued benefit, as well as how these stack up against the realities of their organizations in terms of people, technology, and processes.
The survey revealed a slew of statistics that highlight the biggest concerns of data workers and their workloads. While 75% of companies plan to increase data usage, a seemingly positive prediction, more data and more people using it raises some alarms: How will operations be expanded? What will break? How can you make data easy to prepare, digest, and access?
To answer these questions, Chai pointed viewers toward a big trend: 61% of companies have manual or siloed approaches to enabling data access. If you’re wondering why this happens, Chai provided the stats: 75% of companies must deal with sensitive or structured data, and 61% of data leaders spend more than 10% of their time managing access and security. This accumulation of redundancy and manual labor is a definite disservice to data access, as organization and scale have become the bogeyman for efficient and resilient data architecture.
Fortunately, Chai pointed to a promising data point: the 20% of teams with automated access to data have significantly improved their data operations and have been able to get data to the people who need it faster. The shift toward dynamic, real-time access, automation, self-service, and baked-in security is fundamental to enabling accessible data. Chai concluded that these strategies can be implemented through Satori’s solution, an identity- and data-aware cloud service that dynamically manages access to data across all data warehouses.
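The underlying pattern, deciding access dynamically from the requester's identity and the data's classification rather than through manual, ticket-driven grants, can be sketched briefly. This is a hedged illustration of the concept only, not Satori's actual product or API; every table, column, role, and policy below is hypothetical.

```python
# Sketch of dynamic, identity-aware access control: at query time, a policy
# maps the user's roles against each column's classification, masking what
# the user may not see. All names and policies here are hypothetical.
from dataclasses import dataclass

@dataclass(frozen=True)
class User:
    name: str
    roles: frozenset

# Column-level classification of a hypothetical table.
CLASSIFICATION = {
    "email": "sensitive",
    "ssn": "sensitive",
    "region": "public",
    "amount": "public",
}

# Policy: which roles may read each classification.
POLICY = {
    "public": {"analyst", "engineer", "admin"},
    "sensitive": {"admin"},
}

def authorize_columns(user, requested):
    """Return the columns this user may read and those to mask,
    decided per request instead of via a manual grant process."""
    allowed, masked = [], []
    for col in requested:
        label = CLASSIFICATION.get(col, "sensitive")  # default-deny unknown columns
        if user.roles & POLICY[label]:
            allowed.append(col)
        else:
            masked.append(col)
    return allowed, masked

analyst = User("dana", frozenset({"analyst"}))
allowed, masked = authorize_columns(analyst, ["region", "amount", "email"])
print(allowed, masked)  # ['region', 'amount'] ['email']
```

Because the decision happens at request time, reclassifying a column or changing a role's permissions takes effect immediately, with no backlog of access tickets.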
For more information on data engineering trends and to view Satori’s full report, you can view an archived copy of the webinar here.