Big Data Resources Roundup (5)

2019-08-26 17:48

IPython: a rich architecture for interactive computing; Kibana: visualization of logs and time-stamped data; Matplotlib: plotting with Python;

MetricsGraphics.js: a library built on top of D3, optimized for time-series data; NVD3: chart components for d3.js;

Peity: progressive SVG bar, line, and pie charts;

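Plot.ly: an easy-to-use web service for quickly creating complex charts, from heatmaps to histograms, by uploading data to Plotly's online spreadsheet and designing the chart there; Plotly.js: the open-source JavaScript graphing library that powers Plotly;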

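Recline: a simple but powerful library for building data applications in pure JavaScript and HTML; Redash: an open-source platform for querying and visualizing data; Shiny: a web application framework for R; Sigma.js: a JavaScript library dedicated to graph drawing; Vega: a visualization grammar;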

Zeppelin: a notebook-style tool for collaborative data analysis; ZingChart: a JavaScript charting library for big data.

Internet of Things and Sensors

TempoIQ: cloud-based sensor analytics; 2lemetry: an Internet of Things platform; PubNub: a data stream network;

ThingWorx: a platform that lets enterprises rapidly build and run connected applications;

IFTTT: an internet service often described as a "web automation tool"; its name stands for "If This Then That";

Evrythng: an Internet of Things platform aimed at consumer products, making many everyday products smart.

Recommended Articles

NoSQL Comparison - Cassandra vs MongoDB vs CouchDB vs Redis vs Riak vs HBase vs Couchbase vs Neo4j vs Hypertable vs ElasticSearch vs Accumulo vs VoltDB vs Scalaris comparison;

Big Data Benchmark - benchmarks of Redshift, Hive, Shark, Impala, and Stinger/Tez;

The big data successor of the spreadsheet - argues that the successor to the spreadsheet should be big data.

Papers

2015 – 2016

2015 – Facebook – One Trillion Edges: Graph Processing at Facebook-Scale.

2013 – 2014

2014 – Stanford – Mining of Massive Datasets.

2013 – AMPLab – Presto: Distributed Machine Learning and Graph Processing with Sparse Matrices.

2013 – AMPLab – MLbase: A Distributed Machine-learning System.

2013 – AMPLab – Shark: SQL and Rich Analytics at Scale.

2013 – AMPLab – GraphX: A Resilient Distributed Graph System on Spark.

2013 – Google – HyperLogLog in Practice: Algorithmic Engineering of a State of The Art Cardinality Estimation Algorithm.

2013 – Microsoft – Scalable Progressive Analytics on Big Data in the Cloud.

2013 – Metamarkets – Druid: A Real-time Analytical Data Store.

2013 – Google – Online, Asynchronous Schema Change in F1.

2013 – Google – F1: A Distributed SQL Database That Scales.

2013 – Google – MillWheel: Fault-Tolerant Stream Processing at Internet Scale.

2013 – Facebook – Scuba: Diving into Data at Facebook.

2013 – Facebook – Unicorn: A System for Searching the Social Graph.

2013 – Facebook – Scaling Memcache at Facebook.

2011 – 2012

2012 – Twitter – The Unified Logging Infrastructure for Data Analytics at Twitter.

2012 – AMPLab – Blink and It’s Done: Interactive Queries on Very Large Data.

2012 – AMPLab – Fast and Interactive Analytics over Hadoop Data with Spark.

2012 – AMPLab – Shark: Fast Data Analysis Using Coarse-grained Distributed Memory.

2012 – Microsoft – Paxos Replicated State Machines as the Basis of a High-Performance Data Store.

2012 – Microsoft – Paxos Made Parallel.

2012 – AMPLab – BlinkDB: Queries with Bounded Errors and Bounded Response Times on Very Large Data.

2012 – Google – Processing a trillion cells per mouse click.

2012 – Google – Spanner: Google’s Globally-Distributed Database.

2011 – AMPLab – Scarlett: Coping with Skewed Content Popularity in MapReduce Clusters.

2011 – AMPLab – Mesos: A Platform for Fine-Grained Resource Sharing in the Data Center.

2011 – Google – Megastore: Providing Scalable, Highly Available Storage for Interactive Services.

2001 – 2010

2010 – Facebook – Finding a needle in Haystack: Facebook’s photo storage.

2010 – AMPLab – Spark: Cluster Computing with Working Sets.

2010 – Google – Storage Architecture and Challenges.

2010 – Google – Pregel: A System for Large-Scale Graph Processing.

2010 – Google – Large-scale Incremental Processing Using Distributed Transactions and Notifications (the basis of Percolator and Caffeine).

2010 – Google – Dremel: Interactive Analysis of Web-Scale Datasets.

2010 – Yahoo – S4: Distributed Stream Computing Platform.

2009 – HadoopDB: An Architectural Hybrid of MapReduce and DBMS Technologies for Analytical Workloads.

2008 – AMPLab – Chukwa: A large-scale monitoring system.

2007 – Amazon – Dynamo: Amazon’s Highly Available Key-value Store.

2006 – Google – The Chubby lock service for loosely-coupled distributed systems.

2006 – Google – Bigtable: A Distributed Storage System for Structured Data.

2004 – Google – MapReduce: Simplified Data Processing on Large Clusters.

2003 – Google – The Google File System.

Videos

Data Visualization

The Beauty of Data Visualization

Designing Data Visualizations with Noah Iliinsky

Hans Rosling’s 200 Countries, 200 Years, 4 Minutes
