Radar trends to watch: December 2020
Trends in AI, Robotics, Infrastructure, and more.
This month’s collection of interesting articles that point to important trends is dominated by AI. That’s not surprising; AI has probably been the biggest single category all year. But its dominance over other topics seems to be increasing. That’s partly because there’s more research into why AI fails; partly because we’re beginning to see AI in embedded systems, ranging from giant gas and oil wells to the tiny devices that Pete Warden is working with.
Artificial Intelligence
- Teaching AI to manipulate and persuade: Combine NLP with reinforcement learning, and train in a multiplayer role-playing game. This is where AI gets scary, particularly since AI systems don’t understand what they’re doing (see the next item).
- GPT-3 is great at producing human-like language, but that’s as far as it goes; it has no sense of what an appropriate response to any prompt might be. For example, it has suggested suicide as a solution to depression. This isn’t a surprise, but it means that GPT-3 can’t simply be dropped into applications as it stands.
- Why machine learning models fail in the real world, and why it’s a very difficult problem to fix: Any set of training data can lead to a huge number of models that behave similarly on the training data but very differently on real-world data. Deciding which of these models is “best” (and in which situations) is a difficult and largely unstudied problem.
- TinyNAS: neural architecture search designed to automate the construction of tiny neural networks. Machine learning on small devices will be an increasingly important topic in the coming years.
- Pete Warden on the future of TinyML: There will be hundreds of billions of devices in the next few years. Many of them won’t be “smart”; they’ll be more intelligent versions of dumb devices. We don’t need “smart refrigerators” that can order milk automatically, but we do need refrigerators that can use energy more efficiently and notify us when they’re about to fail.
- The replication crisis in AI: Too many academic AI papers are published without code or data, and using hardware that can’t be obtained by other researchers. Without access to code, data, and hardware, academic papers about groundbreaking results are little more than corporate marketing.
- Machine learning to detect gas leaks: Granted, this is for oil-well scale natural gas leaks, but we should all be more aware of these invisible applications of machine learning. It’s not just autonomous vehicles and face recognition. And lest we forget, invisible applications of ML also have problems with bias, fairness, and accountability.
- Vokens: What happens when you combine computer vision with natural language processing? Is it possible to isolate the meaningful elements in a picture, then use that to inform language models like GPT-3 to add an element of “common sense”?
- Using AI to diagnose COVID-19 via coughs: MIT has developed an AI algorithm that detects features in a cough that indicate a COVID-19 infection. It is at least as accurate as current tests, particularly for asymptomatic people, provides results in real time, and could easily be built into a cell phone app.
- Over time, models in feedback loops (e.g., economic competition) tend to become more accurate for a narrower slice of the population, and less accurate for the population as a whole. Essentially, a model that is constantly retraining on current input will, over time, make itself unfair.
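That feedback dynamic is easy to simulate. Here’s a minimal sketch, not drawn from any particular paper: the data, the two groups, and the rule that only correctly handled interactions feed the next round of training are all assumptions made for illustration. Over a few rounds, the group the model already serves well crowds the other group out of the training data, and accuracy for the under-served group erodes.

```python
# Toy feedback loop: a classifier is repeatedly retrained only on the
# examples it predicted correctly (e.g., customers it served well and who
# therefore stayed). The favored group gradually crowds out the other group
# in the training data, and accuracy for the under-served group erodes.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def sample_population(n):
    """Two subgroups whose labels depend on different features."""
    in_a = rng.random(n) < 0.7            # 70% group A, 30% group B
    X = rng.normal(size=(n, 2))
    y = np.where(in_a, X[:, 0] > 0, X[:, 1] > 0).astype(int)  # A: x0, B: x1
    return X, y, in_a

def acc(model, X, y, mask):
    return (model.predict(X[mask]) == y[mask]).mean()

X, y, in_a = sample_population(5000)
model = LogisticRegression().fit(X, y)     # initial, representative training set

for rnd in range(5):
    Xe, ye, ae = sample_population(5000)   # fresh evaluation sample
    print(f"round {rnd}: group A acc={acc(model, Xe, ye, ae):.2f}, "
          f"group B acc={acc(model, Xe, ye, ~ae):.2f}")
    # Feedback step: only interactions the current model got right feed the
    # next round of training, so the next training set over-represents A.
    Xn, yn, _ = sample_population(5000)
    ok = model.predict(Xn) == yn
    model = LogisticRegression().fit(Xn[ok], yn[ok])
```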
Robotics
- Robots in construction: The construction industry has been resistant to automation. Canvas has built a robot that installs drywall; it is already in use on several major sites, including the renovation of the Harvey Milk Terminal at San Francisco International Airport.
- Simplifying the robot’s model of the external world is the route to better collaboration between robots and humans.
- Honda has won approval to sell a Level 3 autonomous vehicle in Japan. The vehicle is capable of completely taking over driving in certain situations, not just assisting the driver. It should be on sale before March.
Programming
- Nbdev is a literate programming environment for Python. It is based on Jupyter, but encompasses the entire software lifecycle and CI/CD pipeline, not just programming. (A rough sketch of the workflow appears after this list.)
- A visual programming environment for GraphQL is another step in getting beyond text-based programming. A visual environment seems like an obvious choice for working with graph data.
- PHP 8 is out! PHP is an old language, and this release isn’t likely to put it onto the “trendy language” list. But with a huge portion of the Web built with PHP, this new release is important and definitely worth watching.
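Here’s the rough sketch of the nbdev workflow promised above. It’s written from memory of the 1.x releases, so treat the directive names (`# default_exp`, `#export`) as assumptions and check the nbdev documentation; the point is that one notebook is the single source for the library code, the tests, and the docs.

```python
# A single Jupyter notebook drives everything in nbdev. Cells marked for
# export become the installable library; ordinary cells act as tests and docs.

# --- first code cell: name the module these cells export to ---
# default_exp core

# --- an exported cell: ends up in yourpackage/core.py ---
#export
def say_hello(name):
    "Greet `name`. The code, signature, and docstring all flow into the docs."
    return f"Hello, {name}!"

# --- a plain cell: runs as a test (locally and in CI) and shows usage ---
assert say_hello("world") == "Hello, world!"
```

From there, commands along the lines of `nbdev_build_lib` (turn the exported cells into an installable package) and `nbdev_test_nbs` (run every notebook as a test suite, typically in CI) cover the rest of the lifecycle, again as of the 1.x releases.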
Privacy and Security
- Google is adding end-to-end encryption to its implementation of RCS, a standard designed to replace SMS messaging. RCS hasn’t been widely adopted (and, given the dominance of the telephone system, may never be), but standards for encrypted messaging are an important step forward.
- Tim Berners-Lee’s privacy project, Solid, has released its first product: a privacy server for organizations. The idea behind Solid is that people (and organizations) store their own data in secure repositories called Pods that they control. Bruce Schneier has joined Inrupt, the company commercializing Solid.
- CMU has shown that passwords with a minimum length of 12 characters that pass a few simple strength checks are both memorable and resistant to attack. We can move on from password policies that require obscure combinations of uppercase, lowercase, punctuation, and numerals, and that force users to change passwords regularly.
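Part of the appeal is how little code such a policy needs. Here’s a minimal sketch; it isn’t the exact check CMU evaluated, just the length-plus-blocklist idea: require 12 characters, screen against known-weak passwords, and skip composition rules and forced rotation.

```python
# Length-plus-blocklist password check: at least 12 characters, and not on a
# list of known-weak passwords. No composition rules, no forced expiration.
COMMON_PASSWORDS = {
    # stand-in entries; in practice use a large leaked-password list
    "password1234", "123456789012", "qwertyuiop12",
}

def password_ok(candidate):
    if len(candidate) < 12:
        return False, "must be at least 12 characters"
    if candidate.lower() in COMMON_PASSWORDS:
        return False, "appears on a list of commonly used passwords"
    return True, "ok"

print(password_ok("correct horse battery staple"))  # (True, 'ok')
print(password_ok("password1234"))                  # rejected by the blocklist
```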
Infrastructure
- Remember DNS cache poisoning? It’s back. Unfortunately.
- A public mesh WiFi network for New York City: Mesh networks can provide Internet access in locations where established providers don’t care to go, but making them work at scale is difficult. It’s a technology we first heard about in Cory Doctorow’s very strange novel Someone Comes to Town, Someone Leaves Town.
- Hyper-scale indexing: Helios is Microsoft’s reference architecture for the next generation of cloud systems. It is capable of handling extremely large data sets (even by modern standards) and combines centralized cloud computing with edge computing.
Hardware
- The Raspberry Pi 400 looks like a LOT of fun. It’s a Raspberry Pi 4 built into a keyboard (like the very early personal computers): a 1.8 GHz ARM processor, 4 GB of RAM, and more I/O ports than a MacBook Pro. All it needs is a monitor. I just hope the keyboard is good.
- I should say something positive about Apple’s M1, but I won’t. I’m disenchanted enough with them as a company that I really don’t care how good the processor is.
Covid
- Amazon reviews complaining that scented candles have no scent correlate with the spread of COVID-19; loss of smell is a common symptom. A nice application of data analysis using publicly available sources. Data science wins.
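The core of an analysis like this fits in a few lines of pandas. A sketch under obvious assumptions: a scraped DataFrame of candle reviews with date and text columns, and a hand-picked list of “no scent” phrases; neither comes from the original analysis.

```python
# Sketch: weekly share of scented-candle reviews complaining about no scent,
# ready to compare against public weekly COVID-19 case counts.
import pandas as pd

NO_SMELL = r"no (?:scent|smell)|can't smell|cannot smell|doesn't smell"

def weekly_no_smell_share(reviews):
    """reviews: DataFrame with 'date' and 'text' columns (assumed schema)."""
    reviews = reviews.assign(
        week=pd.to_datetime(reviews["date"]).dt.to_period("W"),
        no_smell=reviews["text"].str.contains(NO_SMELL, case=False, regex=True),
    )
    return reviews.groupby("week")["no_smell"].mean()

# share = weekly_no_smell_share(candle_reviews)
# share.corr(weekly_case_counts)   # line up the indexes, then correlate
```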