We live in an era of data overload. The volume of data, the rate at which it arrives, and the variety of its types and formats, coupled with the speed with which it can be accessed and searched, put enormous amounts of useful information at our fingertips. Data sets are growing in part because of the wide array of sources that collect information, such as mobile phones, software logs, cameras, radio-frequency identification (RFID) readers, and wireless sensor networks.
Using this information in a business context is a challenge. We typically break the problem down into three phases:
- Developing the questions. Significant thought must be invested to arrive at the right questions on big data projects. Because the questions that big data can answer often have not been asked before, generating new and creative hypotheses is usually difficult.
- Identifying data sources and creating the technical infrastructures to access them. Extensive research is usually required to identify relevant data sources, and we often find valuable data in unexpected places. Great care must be taken, however, because it is often unclear how the raw data was collected and what data-cleansing techniques have already been applied to it. Understanding how the data was collected and filtered is essential to determining its relevance to the primary questions. In addition, accessing the data may require an understanding of local data privacy and confidentiality regulations.
- Choosing the statistical and modeling techniques to analyze the data. Selecting the analysis techniques, choosing the appropriate data subsets, handling outliers, and addressing other data quality issues all require significant insight.
Iknow assists clients across all phases of big data projects. We help our clients address data capture, ingestion, curation, search, storage, and transfer. We use sophisticated software tools that can handle and analyze very large and complex data sets, and we apply a wide range of analytical approaches, including data mining, text mining, sentiment analysis, predictive analytics, and visualization. We also understand and handle governance issues regarding information privacy.
Iknow's deliverables can include:
- Big data strategy
- Data source identification, procurement, and integration
- Big data pilots and prototypes
- Analysis of complex data sets
Careful analysis of large data sets can uncover new and unexpected correlations. Better decisions based on these insights can yield greater operational efficiencies, better resource allocation, lower costs, and reduced risk.