
(Russian version is here)
General
- Why Spark is the Technology That’s Changing Silicon Valley
- Get Paid To Apply Machine Learning: The Ladder Approach To Becoming a Machine Learning Consultant
- 8 Reasons Apache Spark is So Hot
- 3 Important Technologies That Will Change the Internet - in a few years, the Internet could look like a very different place than it is today. We’ll be able to complete searches by voice, more efficiently search for images by the people or objects they contain, or use video analysis tools to see videos or ads that are more relevant to the content we’re viewing.
- The 3 Key Steps to Building a Predictive App with Machine Learning
- How PayPal uses deep learning and detective work to fight fraud
- IBM acquires deep learning startup AlchemyAPI - so much for AlchemyAPI CEO Elliot Turner’s statement that his company is not for sale. IBM has bought the Denver-based deep learning startup that delivers a wide variety of text analysis and image recognition capabilities via API.
- LIBFFM: A Library for Field-aware Factorization Machines
- spragunr: Theano-based implementation of the deep Q-learning algorithm
Theory, machine learning algorithms and code samples
Mapping Your Music Collection - in this article we'll explore a neat way of visualizing your MP3 music collection. The end result will be a hexagonal map of all your songs, with similar sounding tracks located next to each other. The color of different regions corresponds to different genres of music (e.g. classical, hip hop, hard rock). As an example, here's a map of three albums from my music collection: Paganini's Violin Caprices, Eminem's The Eminem Show, and Coldplay's X&Y.
Factor Analysis - factor analysis (FA) is a latent variable model that describes the variability of a given dataset. It was developed by the psychologists Charles Spearman, Raymond Cattell, and Louis Leon Thurstone.
Base R Plots
ML Pitfalls: Measuring Performance (Part 1) - unfortunately, analysis lives and dies by self-reported metrics. Is feature A better than feature B? Is this classifier better than another? How much confidence can I have in this financial report? From development to consumption, almost every decision regarding analytics inherently asks "How good is this model?"
- Gradient Descent Training Using C# - anyone who starts investigating ML quickly encounters the somewhat mysterious phrase “gradient descent.” In this article, James McCaffrey will explain what gradient descent is and demonstrate how to use it to train a logistic regression classification system.
- Understanding Natural Language with Deep Neural Networks Using Torch
Interactive Data Visualization with D3.js, DC.js, Python, and MongoDB - data visualization plays an important role in data analysis workflows. It enables data analysts to effectively discover patterns in large datasets through graphical means, and to represent these findings in a meaningful and effective way. Data visualization is an interdisciplinary field, which requires design, web development, database and coding skills.
- Calculate PageRanks with Apache Hadoop
Introduction to Machine Learning with Python and Scikit-Learn
Online courses, learning materials and literature
Online-course: Deep Learning for Natural Language Processing - natural language processing (NLP) is one of the most important technologies of the information age. Understanding complex language utterances is also a crucial part of artificial intelligence. Applications of NLP are everywhere because people communicate most everything in language: web search, advertisement, emails, customer service, language translation, radiology reports, etc.
Online-course: Introduction to Computational Thinking and Data Science - 6.00.2x is aimed at students with some prior programming experience in Python and a rudimentary knowledge of computational complexity. We have chosen to focus on breadth rather than depth. The goal is to provide students with a brief introduction to many topics, so that they will have an idea of what’s possible when the time comes later in their career to think about how to use computation to accomplish some goal.
Online-course: The Analytics Edge - in the last decade, the amount of data available to organizations has reached unprecedented levels. Data is transforming business, social interactions, and the future of our society. In this course, you will learn how to use data and analytics to give an edge to your career and your life. We will examine real world examples of how analytics have been used to significantly improve a business or industry.
Online-course: Data Analysis and Statistical Inference - this course introduces you to the discipline of statistics as a science of understanding and analyzing data. You will learn how to effectively make use of data in the face of uncertainty: how to collect data, how to analyze data, and how to use data to make inferences and conclusions about real world phenomena.
Book review: About Time Series Databases and a New Look at Anomaly Detection by Ted Dunning and Ellen Friedman - this blog post is a review of two books written by Ted Dunning and Ellen Friedman and published by O'Reilly. Both are available for free from the MapR site.
Free online-book: Kalman and Bayesian Filters in Python
Free Big Data Analytics Handbook - Brian Liou from Leada was kind enough to provide a guest post about their latest handbook, The Data Analytics Handbook: Big Data Edition.
Videos, podcasts
Deep Learning at Flickr, Pierre Garrigues - Pierre Garrigues, a researcher in machine perception and learning at Flickr, spoke at the Deep Learning Summit at the end of January about how Flickr uses deep learning to automate the labelling of its image libraries, including the 10 million uploads it receives each day.
Partially Derivative: Episode 16: Algorithm Aversion - this week the team talks about Jonathon's new ISIS analysis, iPython 3, Indian Food, Algorithm Aversion, and more!
Data engineering
- Taming Apache Storm for Real-Time Analytics - Apache Storm is gaining a foothold among organizations looking to do real-time analytics on streaming data. However, the difficulty in working with the distributed processing framework is proving to be a major hurdle to Storm adoption. Now, a company called Impetus says it’s simplifying development on Storm with a new product.
- Using MongoDB with Hadoop & Spark: Part 2 - Hive Example
- Using MongoDB with Hadoop & Spark: Part 3 - Spark Example & Key Takeaways
Digests
- Top stories for Feb 22-28: Gartner 2015 MQ for Advanced Analytics: gainers and losers; History of Data Science Infographic (KDnuggets.com)
- Top stories in February: 10 things statistics taught about big data; Gartner Analytics MQ: gainers and losers (KDnuggets.com)
- Weekly Digest - March 9 (DataScienceCentral.com)
- Data Science News 8 March 2015 (MyDataMine.com)
- Big Data News 5 March 2015 (MyDataMine.com)
- Issue 25 - March 6th 2015 (DataElixir.com)
- This Week in Data (March 6, 2015) (r1soft.com)
- This Month in the Ecosystem (February 2015) (Cloudera.com)
- Weekly Hadoop News 3 March 2015 (MyDataMine.com)
- Stuff The Internet Says On Scalability For March 6th, 2015 (HighScalability.com)
Previous digest: Data science digest #39 (23 February - 1 March 2015)
All data science digests: Data science digests