Numbers are persuasive, and visual representations add the ‘wow’ factor. Today's sophisticated tools make this possible by churning large amounts of data into beautiful, picturesque dashboards. Some look like abstract art that would certainly have thrilled Picasso. CxOs use these dashboards to make ‘fact-based’ business decisions. I met a CxO whose screen saver was a series of dashboards; she mentioned with pride that she had her finger on the pulse of the business and that her executives ran the business on quantitative facts. Very impressive. It did not take long for the company to run into trouble. The primary reason was that the attractive dashboards showed numbers that were not truly representative. What I am driving at is that the user of analytics needs a full understanding of the underlying algorithm that produces the numbers. This is a life saver, and it requires looking past the masterpiece of the dashboard and poking one's head a bit into the unknown. Data in itself is an asset; digital gold. The problem lies in the packaging.
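To illustrate how a dashboard number can fail to be representative, here is a minimal Python sketch. The revenue figures are made up for illustration and do not come from any real company; the point is only that an ‘average’ tile depends on which algorithm sits behind it.

```python
import statistics

# Hypothetical monthly revenue per customer account (invented figures):
# nine small accounts and one very large one.
revenue = [120, 135, 110, 140, 125, 130, 115, 145, 128, 9500]

# Two common ways a dashboard might compute "typical revenue per account".
mean_revenue = statistics.mean(revenue)      # pulled upward by the one large account
median_revenue = statistics.median(revenue)  # middle value, robust to the outlier

print(f"Mean revenue per account:   {mean_revenue:.1f}")
print(f"Median revenue per account: {median_revenue:.1f}")
```

With this invented data the mean is roughly eight times the median, so the same tile could tell two very different stories. Neither figure is wrong; the user simply needs to know which one is being shown.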
I list the five greatest ‘devils’ that result from poor packaging and can turn analytics into a punching bag to relieve stress.
When an analytics project is under consideration, it is necessary to have a project team that is conversant with the business and has sound knowledge of the principles of statistics. Secondly, any data source that is not included in the project must be ring-fenced and the risk arising from its exclusion assessed. Thirdly, as the project progresses, each algorithm that is coded must be reviewed, tested and ‘dry run’ before moving into production. Finally, to counter the biggest devil, the documentation must be comprehensive, with generous examples (use cases, if you will) of how the data is used, and must clearly describe the outcome.
The one reality for any flavor of analytics, whether descriptive, predictive or prescriptive, is a necessary reliance on historical data. Make sure that data is accurate.
Now to the cliché: data is the new oil. But oil in itself cannot run an automobile unless it is processed in the right way.
Re. "Make sure the size of a sample has some relation to the size of data that is interpreted." This is old wisdom that we've known for a long time.
Lately, however, I see many sample sizes of ~2000, regardless of the size of the universe. When I looked around, I read in some random place that "sample size beyond 2000 does not increase the confidence level of the results, however large the population beyond 20000."
This seems to contradict conventional wisdom. Keen to know your views on whether this is yet another example of "how to lie with statistics" or the outcome of some drastic advancement in the field of statistics that totally upends old wisdom about sample
I am suggesting two different attributes of a sample. The first is that the sample has to represent the population. For accuracy, two separate random samples may help. The second is the sample size itself. The larger the better. 95% confidence is good to draw
I know, but the source I referenced somewhat contradicts your second assertion, "the larger the better". I wonder if the resolution of the conundrum lies in the proposition that it's extremely hard to create a representative sample of just 2000 if the population size is high, i.e. a large population makes it impossible to fulfill your first attribute.
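One way to explore the ~2000 question raised in this thread is the textbook margin-of-error formula for a proportion estimated from a simple random sample, with the finite population correction. The Python sketch below is illustrative only: the function name `margin_of_error`, the 95% z-value of 1.96, the worst-case proportion p = 0.5 and the population sizes are assumptions chosen for the example, not figures from the source quoted above.

```python
import math

def margin_of_error(n, population, z=1.96, p=0.5):
    """Margin of error for a proportion from a simple random sample.

    Assumptions (not from the thread): z=1.96 for ~95% confidence,
    p=0.5 as the worst case, and a truly random sample.
    """
    standard_error = math.sqrt(p * (1 - p) / n)
    # Finite population correction: matters only when n is a sizeable
    # fraction of the population.
    fpc = math.sqrt((population - n) / (population - 1))
    return z * standard_error * fpc

# Same sample size, very different population sizes.
for population in (20_000, 200_000, 2_000_000, 200_000_000):
    moe = margin_of_error(2000, population)
    print(f"n=2000, population={population:>11,}: +/- {moe:.2%}")

# Larger sample, same (large) population.
print(f"n=8000, population=200,000,000: +/- {margin_of_error(8000, 200_000_000):.2%}")
```

Under these assumptions, the margin of error for a sample of 2000 stays close to 2 percentage points whether the population is 20,000 or 200 million, which may be why that sample size recurs. A larger sample still narrows the margin (quadrupling it roughly halves the margin), so "the larger the better" holds, but with diminishing returns. Note that the confidence level itself is a chosen setting; what the sample size changes is the margin of error at that level. None of this helps if the sample is not representative, which the formula simply assumes.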
04 Jan 2018
This post is from a series of posts in the group:
A community blog about data and how to manage it