What is U-SQL? DevToolsGuy / Thu, Jan 7, 2016 Microsoft has taken another step towards making analysis of big data easier with the introduction of their U-SQL language, the new query language designed to run on the Azure Data Lake Store. Announced...
5 trends in Big Data DevToolsGuy / Fri, Nov 20, 2015 “Big data” is the common term for the exponential growth and availability of data, both structured and unstructured. Referring to it as ‘big’ data is perhaps somewhat of an understatement – IBM estimated...
Aspects of Datasets - Part 2 Tim Brock / Fri, Jul 31, 2015 This is the second (and final) article looking at key aspects of datasets. Having previously covered relevance, accuracy, and precision, here we will consider consistency, completeness and size. Consistency...
Visual Explorations of Sample Size Tim Brock / Mon, May 25, 2015 Drawing conclusions based on small samples is obviously problematic. At the same time, I also wonder whether the rise to prominence of "Big Data" can lead organisations to blindly collect as much data...
Too Big Data: Coping with Overplotting Tim Brock / Mon, Apr 20, 2015 Scatter plots are a wonderful way of showing (apparent) relationships in bivariate data. Patterns and clusters that you wouldn't see in a huge block of data in a table can become instantly visible on...
What is A PetaByte? DevToolsGuy / Mon, Oct 22, 2012 According to Wikipedia: “A petabyte (derived from the SI prefix peta-) is a unit of information equal to one quadrillion (short scale) bytes, or 1 billiard (long scale) bytes. The unit symbol...