Before we start, I think it's very important to define what a data warehouse is. The clinical definition, courtesy of Wikipedia, is that a data warehouse, sometimes also called an enterprise data warehouse, is "a system used for reporting and data analysis, and is considered a core component of business intelligence. DWs are central repositories of integrated data from one or more disparate sources.'
Does that make sense? If it doesn't totally snap together in your head, that's OK. All you really need to know is that a data warehouse is a central repository for all of the data used to pull reports and analytics for your enterprise. Now you know what business segment will take up most of a data warehouse analyst's time and attention.
One of the core duties of data warehouse analyst is to develop a system for gathering together and documenting all of the business requirements for the enterprise, so in that sense they will function as a business analyst. The analyst will also design and execute an innovative reporting system for processes, and monitor efficient transition of business requirements to technical specifications.
These analysis tasks are inherent in the job function of a data warehouse analyst, but they can also be gleamed from a business analyst function. A successful DW analyst will also need to have a deep understanding of data and data analysis. You will assist departments to prepare data mining processes and develop designs for metadata.
A DW analyst will also be expected install all SQL Server software and perform upgrades, as well as backups. This will be database administrator work. If it's not clear already, data warehouse analyst is multifunction role that requires the wearing of a lot of hats, as well as the keeping of a lot of balls in the air.
Finally, a DW analyst should expect to participate in various project meetings and prepare applications for the database. This job function really revolves around project management skills.
Change in this employment niche are being largely driven 'big data,' AI, and machine learning. As the databases get larger and have to be faster, the technologies change. Rather than sticking with a classic SQL, large databases are transitioning to a NoSQL format — and that is a shift I am sure will not revert back.
The language to address these NoSQL structure is Python, and while understanding the ins and outs of Python isn't always listed as a requirement in data warehouse analyst job postings, it is a growing necessity. Knowledge of Python will make any aspiring DW analyst a valuable commodity.
Python was created at the tail end of the 1980s, but remains an incredibly important programming language used in countless computer applications today. In addition, applications that do not utilize Python often require interpretation, so that they may work in tandem with those programs that do. In the end, Python is a valuable specialty asset to have in any data science-related and/or -driven career.
Privacy is another area where a successful data warehouse analyst must run to keep up. I honestly think most people are not going to sit still for an open-ended, in-broad-daylight assault of their privacy — so widespread data harvesting will be more subtle and subversive
I think data warehouse analysts will see large-scale video surveillance come to fruition and that is just one area where on-device AI is enabling a fundamental shift in the industry. IBM is the organization that created Watson, the world-renowned artificial intelligence computer system used in weather prediction, computer academia, and even presented as a contestant on the popular game show Jeopardy.
Directly from IBM, machine learning is described as 'a form of AI that enables a system to learn from data rather than through explicit programming.' While the implications of this in itself are huge and worthy of extensive literature, one can certainly surmise, just for the purposes of this article, the important connections between data science and machine learning.
Other industries, from fast food to oil and gas, are also seeing the benefits of deploying on-device AI. All of this functionality is going to have to be maintained and data warehouse analysts will be the ones taking care of those maintenance and repair issues.
Job preparation (and certification)
So you've decided to pursue this role. What do you need to know to succeed as a data warehouse analyst? As for training and background, you will need a very solid business degree or math background to get started. Once you have that you can start on the other pieces of the job description.
Remember I mentioned that a business analyst skill set is required for some of your job duties as a data warehouse analyst? Well, why not pick up a certification for that function? PMI offers a great certification for business analysts — yes, I know DW analysts are not specifically in that role. You will be doing a lot of business analysis, however, and a solid cert in your pocket will only help.
The data side of your job will require a good math and analytical background, so you could pick up a really nice cert for big data. I recommend the Cloudera Certified Associate (CCA) Data Analyst. This will prep you for the NoSQL and big data items you will need. It is focused on using Data Definition Language (DDL) statements to create or alter structures in the metastore for use by Hive and Impala
Last but not least, you are going to have to interact and work with other people — maybe even run a project. You know what I'm going to recommend next. Yep, get the Project Management Professional (PMP) from PMI. Even if 'project manager' isn't part of your title, the PMP credential applies to anyone who helps bring a project to completion. Just like a CPA validates expertise for accountants, the PMP recognizes your ability to manage projects — and the hard work you've done so far.
If you are going to be a data warehouse analyst, my belief is that you will need a full gamut of skills to make yourself a well rounded employee. Companies don't want a siloed individual. They want a person who has analytics, analysis, project management ... all that on top of the database and cloud technology knowledge. If you go down this path, or you are already in the job function, then I wish you great success.