Data Analysis and R-Node
Jamie Love
N-Squared Software
Introduction
- Who am I
- Data and Analysis. Why this talk?
- Data is gathered, and provided, in ever-increasing amounts.
- The Internet makes accessing data easier.
- Analysis is important to understand and reason about data.
- Open Source software can help with this.
- R-Node is (will be) about making one Open Source tool more accessible.
Data Analysis
- Massive Industry
- Data Warehousing
- Software Agents
- Statistics and Data Mining
- SPSS, SigmaPlot (the big boys)
- Open Source - R (+ UI, e.g. R Commander)
- Business Intelligence (BI)
- Other
Data Visualisation
- One aspect
- Edward Tufte - The Visual Display of Quantitative Information.
- Junkcharts - A blog about good chart design.
- Chart Porn - A blog about (content-light, but pretty) infographics.
Aside: Open Data
- Open APIs
- Programmable Web (Google Maps, Twitter, Flickr).
- Open government data initiatives (e.g. NZ).
- Limited access to raw data and analysis code from scientists. Call for a change.
The Crux: Data Visualisation
- Basic Libraries
- Data Analysis
R-Node
- A web frontend to R (http://www.squirelove.net/r-node).
- Why?
- Because I needed something to do in the evening after my daughter went to bed.
- Really, why?
- Because I wanted a way for people to leverage the power of R in their web-based systems.
- Demo
- Future
- There is a need for R to be accessible to users as a service.
- I will focus on R-Node becoming a strong web-based UI for a central R service.