Promising, practical ... and, as with so many applications of massive data collection and analysis, maybe a little perturbing. This post is primarily for the materials scientist in the family, but it should be interesting to anyone.
Scientists at MIT and Berkeley, using Artificial Intelligence algorithms to pore over abstracts from papers related to materials science, have successfully predicted scientific discoveries.
Researchers from the Lawrence Berkeley National Laboratory used an algorithm called Word2Vec to sift through scientific papers for connections humans had missed. Their algorithm then spit out predictions for possible thermoelectric materials. ... The algorithm didn’t know the definition of thermoelectric, though. It received no training in materials science. Using only word associations, the algorithm was able to provide candidates for future thermoelectric materials.
Using just the words found in scientific abstracts, the algorithm was able to understand concepts such as the periodic table and the chemical structure of molecules. The algorithm linked words that were found close together, creating vectors of related words that helped define concepts. In some cases, words were linked to thermoelectric concepts but had never been written about as thermoelectric in any abstract they surveyed. This gap in knowledge is hard to catch with a human eye, but easy for an algorithm to spot.
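For the curious, here is a rough sketch of the "linked words" idea in Python. To be clear, this is not Word2Vec and not the researchers' code: Word2Vec learns its vectors with a small neural network, while this toy just counts which words share an abstract and compares the counts with cosine similarity. The five "abstracts", the stopword list, the function names, and the material called compound_x are all made up for illustration.

```python
from collections import Counter, defaultdict
from math import sqrt

# Toy "abstracts" invented for illustration; "compound_x" is a placeholder,
# not a real material, and this is not real data.
ABSTRACTS = [
    "bi2te3 is a thermoelectric material with a high seebeck coefficient",
    "pbte is a thermoelectric material with low thermal conductivity",
    "compound_x shows a high seebeck coefficient and low thermal conductivity",
    "nacl is a salt that dissolves in water",
    "kcl is a salt that dissolves in water",
]

# A tiny, hand-picked stopword list (an assumption made for this toy corpus).
STOPWORDS = {"a", "and", "in", "is", "that", "with"}


def tokenize(text):
    """Lowercase, split on whitespace, and drop stopwords."""
    return [w for w in text.lower().split() if w not in STOPWORDS]


def cooccurrence_vectors(docs):
    """Map each word to counts of the other words sharing an abstract with it."""
    vectors = defaultdict(Counter)
    for doc in docs:
        tokens = tokenize(doc)
        for i, word in enumerate(tokens):
            for j, other in enumerate(tokens):
                if i != j:
                    vectors[word][other] += 1
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def rank_candidates(docs, target, candidates):
    """Rank candidates that never share an abstract with the target word."""
    vectors = cooccurrence_vectors(docs)
    token_sets = [set(tokenize(d)) for d in docs]
    unseen = [
        c for c in candidates
        if not any(c in s and target in s for s in token_sets)
    ]
    return sorted(unseen, key=lambda c: cosine(vectors[c], vectors[target]), reverse=True)


ranking = rank_candidates(
    ABSTRACTS, "thermoelectric", ["compound_x", "nacl", "kcl", "bi2te3"]
)
print(ranking)
```

In this toy corpus compound_x is never called thermoelectric, but it keeps the same company (Seebeck coefficient, thermal conductivity) as the materials that are, so it lands at the top of the list. bi2te3 is left out because it already appears alongside the word, which is the "gap" the quoted passage describes.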
In one experiment, researchers analyzed only papers published before 2009 and were able to predict one of the best modern-day thermoelectric materials four years before it was discovered in 2012.
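The "pretend it's 2009" test is easy to picture in code. Below is a minimal sketch of that kind of time split, assuming each abstract carries a publication year. The records, the `Abstract` class, the helper names, and compound_x are hypothetical; the sketch only shows how a prediction made from older papers could be checked against newer ones, not how the researchers actually did it.

```python
from dataclasses import dataclass


@dataclass
class Abstract:
    year: int
    text: str


# Toy records invented for illustration; "compound_x" is a placeholder.
CORPUS = [
    Abstract(2005, "bi2te3 thermoelectric material high seebeck coefficient"),
    Abstract(2007, "compound_x high seebeck coefficient"),
    Abstract(2012, "compound_x thermoelectric material record efficiency"),
]


def split_by_year(corpus, cutoff_year):
    """Split into abstracts published before the cutoff and from the cutoff on."""
    past = [a for a in corpus if a.year < cutoff_year]
    future = [a for a in corpus if a.year >= cutoff_year]
    return past, future


def mentioned_together(corpus, word_a, word_b):
    """True if any single abstract contains both words."""
    return any(
        word_a in a.text.split() and word_b in a.text.split() for a in corpus
    )


past, future = split_by_year(CORPUS, cutoff_year=2009)

# The candidate counts as a genuine prediction only if the older literature
# never paired it with the target word...
is_new = not mentioned_together(past, "compound_x", "thermoelectric")
# ...and it counts as confirmed if the later literature does pair them.
confirmed = mentioned_together(future, "compound_x", "thermoelectric")
print(is_new, confirmed)
```

The point of splitting this way is that the model only ever sees the older papers, so a hit in the newer ones can't be something it simply read.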
This new application of machine learning goes beyond materials science. Because it’s not trained on a specific scientific dataset, you could easily apply it to other disciplines, retraining it on literature of whatever subject you wanted.
Here's an article from MIT that's a bit more technical.
MIT and Berkeley may be doing this particular research, but anyone want to guess where the Word2Vec algorithm was developed?
Google.