If you have ever spent hours (or days) working on a data analysis project, stepped away for a while, come back to the PC and then really struggled to restart, you might just have a documentation issue.
Documentation of what you do on a project---and how you do it---is good practice, regardless of project size or scope. Aside from having continuity in your daily work, other benefits of good documentation are repeatability and clarity for the client.
Recently I began working with a new client, initially on a pro bono basis (not my normal model of work, but I felt led to reach out to the company, and so I did). In my first task, I spent a few days analyzing and cleaning data using Excel.
Excel has always been my go-to analysis tool, but I also have used Tableau and Microsoft BI for data visualization. I have been trained in some statistical applications (use it or lose it!), and in more recent years, I have learned to use R and Python in academic settings.
Typically, I work at a fast pace to drive a project to its end state. I enjoy delivering a final product that is at, or above, the level that a client expects. I have found that documentation is a client expectation, even when it is not explicitly stated in an RFP. After losing track of the details on a few early-career projects, I am now careful to note what I do and how I do it. A third aspect is equally important: why I choose a particular method.
A Variety of Tools to Choose From
Excel is my preferred tool for analysis; however, it is not the best tool for documenting the process followed. In Excel, I place my notes on a summary sheet tab within the workbook, with links to the referenced table or chart, and/or in a separate file (Word or PowerPoint). Although I have used Python (in a Jupyter notebook) much less than Excel, I have observed that it is superior to Excel in this respect.
As I enter the next phase of working with the aforementioned client, I am finding that in addition to Python being an equally (or more) powerful analysis tool, the Jupyter notebook interface is designed for documenting the steps of an analysis process alongside writing and running the corresponding code. As such, repeatability is designed into the application. Most data analysts and data scientists would agree that making a project repeatable is an important requirement.
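To make the what/how/why habit concrete, here is a minimal sketch of a cleaning step that records its own documentation as it runs. It uses only the Python standard library. The names `AnalysisLog` and `clean_rows`, the column names, and the sample rows are all hypothetical, chosen for illustration; they are not taken from the client project.

```python
from dataclasses import dataclass, field


@dataclass
class AnalysisLog:
    """Hypothetical helper: collects what was done, how, and why."""
    entries: list = field(default_factory=list)

    def record(self, what, how, why):
        self.entries.append({"what": what, "how": how, "why": why})

    def to_markdown(self):
        # Render the log as a numbered list that can be pasted into a report.
        lines = []
        for i, entry in enumerate(self.entries, start=1):
            lines.append(f"{i}. {entry['what']}")
            lines.append(f"   - How: {entry['how']}")
            lines.append(f"   - Why: {entry['why']}")
        return "\n".join(lines)


def clean_rows(rows, log):
    """Apply two illustrative cleaning steps and document each one."""
    # Step 1: drop rows where the amount is missing.
    kept = [r for r in rows if r.get("amount") not in (None, "")]
    log.record(
        what=f"Dropped {len(rows) - len(kept)} row(s) with a missing amount",
        how="Filtered out rows where 'amount' is None or an empty string",
        why="Totals cannot be computed for rows without an amount",
    )

    # Step 2: trim stray whitespace from customer names.
    cleaned = [{**r, "customer": r["customer"].strip()} for r in kept]
    log.record(
        what="Trimmed whitespace from customer names",
        how="Applied str.strip() to the 'customer' field",
        why="Leading or trailing spaces cause duplicate customers when grouping",
    )
    return cleaned


# Made-up sample data, for illustration only.
raw_rows = [
    {"customer": " Acme ", "amount": 120.0},
    {"customer": "Birch Co", "amount": None},
    {"customer": "Cedar LLC ", "amount": 75.5},
]

log = AnalysisLog()
clean = clean_rows(raw_rows, log)
print(log.to_markdown())
```

In a Jupyter notebook, the same notes could instead live in Markdown cells between the code cells. Keeping them next to the code, in either form, means that rerunning the analysis also reproduces the record of what was done and why.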
I have a few consulting projects in the pipeline for the remainder of this year. Whether I use Excel, Python, or another application, I will remain flexible and open to learning new tools. The hammer is not my only tool, but I definitely want to nail every job!