Py datanyc2015
Transcript of Py datanyc2015
PyData NYC 2015, November 10th 2015
Karim Chine [email protected]
Towards a universal platform for data science on public and private clouds
2
A universal open platform for data science
Computational Components: R packages; wrapped C, C++, Fortran code; Python modules; Matlab toolkits… Open source or commercial
Computational Resources: clusters, grids, private or public clouds. Free or pay-per-use
Computational GUIs: HTML5 and Desktop Workbench. Built-in views / Plugins / Collaborative views. Open source or commercial
Computational Scripts: R / Python / Matlab / Groovy
Computational APIs: Java / SOAP / REST. Stateless and stateful
Computational Storage: Local, NFS, FTP, Amazon S3, EBS
Generated Computational Web Services: stateful or stateless, mapping of R objects/functions
Elastic-R
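The "stateless and stateful" distinction in the APIs and generated web services above can be illustrated with a small sketch in plain Python. The names `stateless_mean` and `StatefulSession` are hypothetical and only stand in for the idea; they are not part of the Elastic-R API, and nothing here talks to a remote engine.

```python
"""Toy contrast between stateless and stateful computational calls.

All names are hypothetical illustrations, not the Elastic-R API.
"""
from statistics import mean
from typing import Dict, List


def stateless_mean(values: List[float]) -> float:
    """Stateless style: the call carries all the data it needs."""
    return mean(values)


class StatefulSession:
    """Stateful style: the engine keeps a workspace between calls."""

    def __init__(self) -> None:
        self._workspace: Dict[str, List[float]] = {}

    def assign(self, name: str, values: List[float]) -> None:
        # Store a copy so later changes by the caller do not leak in.
        self._workspace[name] = list(values)

    def mean_of(self, name: str) -> float:
        # Later calls refer to data by name instead of resending it.
        return mean(self._workspace[name])


if __name__ == "__main__":
    print(stateless_mean([1.0, 2.0, 3.0]))

    session = StatefulSession()
    session.assign("x", [1.0, 2.0, 3.0])
    print(session.mean_of("x"))
```

A stateless call is easy to scale and retry because every request is independent; a stateful session avoids resending data but ties the caller to one engine instance.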
3
Infrastructures federation: rosetta virtual cloud
Public Clouds
Private Cloud
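One way to picture a federation of public and private clouds is a single front end that forwards requests to interchangeable backends. The sketch below is only an assumption about the general shape of such a layer; `CloudBackend`, `InMemoryBackend` and `VirtualCloud` are hypothetical names, and the backends are in-memory stand-ins rather than real cloud clients.

```python
"""Sketch of a federation layer over several cloud backends.

Hypothetical names throughout; no real cloud is contacted.
"""
from abc import ABC, abstractmethod
from typing import Dict, List


class CloudBackend(ABC):
    """Minimal interface every federated backend is assumed to offer."""

    @abstractmethod
    def start_machine(self, image: str) -> str:
        """Start a machine from an image and return its identifier."""


class InMemoryBackend(CloudBackend):
    """Stand-in backend that only records what it was asked to start."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.machines: List[str] = []

    def start_machine(self, image: str) -> str:
        machine_id = f"{self.name}-{len(self.machines) + 1}"
        self.machines.append(machine_id)
        return machine_id


class VirtualCloud:
    """Single entry point that routes requests to a named backend."""

    def __init__(self) -> None:
        self._backends: Dict[str, CloudBackend] = {}

    def register(self, name: str, backend: CloudBackend) -> None:
        self._backends[name] = backend

    def start_machine(self, backend_name: str, image: str) -> str:
        return self._backends[backend_name].start_machine(image)


if __name__ == "__main__":
    cloud = VirtualCloud()
    cloud.register("public", InMemoryBackend("public"))
    cloud.register("private", InMemoryBackend("private"))
    print(cloud.start_machine("public", "some-image"))
    print(cloud.start_machine("private", "some-image"))
```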
4
AWS: programmable infrastructure
Command Line
Web Console
SDK
API
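As an illustration of the SDK route to programmable infrastructure, the sketch below prepares an EC2 `RunInstances` request with the AWS SDK for Python (boto3). The helper names, the `t2.micro` default, the `us-east-1` region and the placeholder image ID are assumptions made for the example; the `launch` function needs boto3 and valid AWS credentials and is not called here.

```python
"""Sketch: infrastructure as code through the AWS SDK for Python (boto3)."""
from typing import Any, Dict


def build_run_instances_request(image_id: str,
                                instance_type: str = "t2.micro",
                                count: int = 1) -> Dict[str, Any]:
    """Return keyword arguments for the EC2 RunInstances call."""
    if count < 1:
        raise ValueError("count must be at least 1")
    return {
        "ImageId": image_id,
        "InstanceType": instance_type,
        "MinCount": count,
        "MaxCount": count,
    }


def launch(image_id: str, region: str = "us-east-1") -> str:
    """Launch one instance and return its ID (needs boto3 and credentials)."""
    import boto3  # imported lazily so the request builder works without the SDK

    ec2 = boto3.client("ec2", region_name=region)
    response = ec2.run_instances(**build_run_instances_request(image_id))
    return response["Instances"][0]["InstanceId"]


if __name__ == "__main__":
    # "ami-00000000" is a placeholder, not a real image ID.
    print(build_run_instances_request("ami-00000000"))
```

Keeping the request construction separate from the network call makes the infrastructure description easy to inspect, log and test without touching an AWS account.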
5
rosettaHUB: programming with data and infrastructure
Command Line
Web Console
SDK
API
6
Google Docs-like real time collaboration
7
Traceable and Reproducible data science
Elastic-R Amazon Machine Images
Elastic-R AMI 1: R 2.10, BioC 2.5
Elastic-R AMI 2: R 2.9, BioC 2.3
Elastic-R AMI 3: R 2.8, BioC 2.0
Amazon Elastic Block Stores
Elastic-R EBS 1: Data Set XXX
Elastic-R EBS 2: Data Set YYY
Elastic-R EBS 3: Data Set ZZZ
Elastic-R EBS 4: Data Set VVV
Elastic-R.org
Elastic-R AMI 2 (R 2.9, BioC 2.3) with Elastic-R EBS 4 (Data Set VVV)
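The pairing of a versioned machine image with a data volume, as on this slide, can be recorded as a small manifest so that a result can be traced back to the environment that produced it. The sketch below is one possible way to do that, not the mechanism used by Elastic-R; the `Environment` class and its `fingerprint` method are hypothetical.

```python
"""Sketch: recording which machine image and data volume produced a result."""
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Environment:
    """Software image plus data volume, as paired on the slide."""
    ami: str
    r_version: str
    bioc_version: str
    ebs: str
    dataset: str

    def fingerprint(self) -> str:
        """Stable SHA-256 digest of the manifest, usable as a trace ID."""
        # sort_keys gives a canonical serialization, so equal manifests
        # always hash to the same value.
        payload = json.dumps(asdict(self), sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()


if __name__ == "__main__":
    env = Environment(
        ami="Elastic-R AMI 2",
        r_version="2.9",
        bioc_version="2.3",
        ebs="Elastic-R EBS 4",
        dataset="Data Set VVV",
    )
    print(json.dumps(asdict(env), indent=2))
    print(env.fingerprint())
```

Storing the fingerprint next to each result makes it cheap to check whether two analyses ran against the same software and data.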
8
Architecture
9
Architecture
10
Data science universal engine
Remote Java/R processes
Event-driven remote objects/engines: R, Python, Mathematica, Matlab, Scilab, ...
Collaborative spreadsheets
Collaborative scientific graphics canvas
Collaborative dashboard with collaborative widgets
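The event-driven, collaborative behaviour listed above can be sketched as a shared cell store that notifies every subscribed view when a cell changes. This is a single-process illustration under simplifying assumptions (no networking, no conflict handling); `SharedSheet` and its methods are hypothetical names, not the platform's API.

```python
"""Sketch: an event-driven shared spreadsheet, in a single process."""
from typing import Callable, Dict, List, Optional

# A listener receives the cell reference and its new value.
Listener = Callable[[str, object], None]


class SharedSheet:
    """Cell store that pushes every change to all subscribed views."""

    def __init__(self) -> None:
        self._cells: Dict[str, object] = {}
        self._listeners: List[Listener] = []

    def subscribe(self, listener: Listener) -> None:
        self._listeners.append(listener)

    def set_cell(self, ref: str, value: object) -> None:
        self._cells[ref] = value
        # Notify every collaborator's view of the change.
        for listener in self._listeners:
            listener(ref, value)

    def get_cell(self, ref: str) -> Optional[object]:
        return self._cells.get(ref)


if __name__ == "__main__":
    sheet = SharedSheet()
    sheet.subscribe(lambda ref, value: print(f"view 1 sees {ref} = {value}"))
    sheet.subscribe(lambda ref, value: print(f"view 2 sees {ref} = {value}"))
    sheet.set_cell("A1", 3.14)
```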
11
www.rosettahub.com