March was a busy month, especially for developers working with GPT-3. After surprising everybody with its ability to write code, it’s not surprising that GPT-3 is appearing in other phases of software development. One group has written a tool that creates regular expressions from verbal descriptions; another tool generates Kubernetes configurations from verbal descriptions. In his newsletter, Andrew Ng talks about the future of low-code AI: it’s not about eliminating coding, but eliminating the need to write all the boilerplate. The latest developments with large language models like GPT-3 suggest that the future isn’t that distant.
On the other hand, the US Copyright Office has determined that works created by machines are not copyrightable. If software is increasingly written by tools like Copilot, what will this say about software licensing and copyright?
- An unusual form of matter known as spin glass can potentially allow the implementation of neural network algorithms in hardware. One particular kind of network allows pattern matching based on partial patterns (for example, face recognition based on a partial face), something that is difficult or impossible with current techniques.
- OpenAI has extended GPT-3 to do research on the web when it needs information that it doesn’t already have.
- Data-centric AI is gaining steam, in part because Andrew Ng has been pushing it consistently. Data-centric AI claims that the best way to improve an AI system is to improve the data, rather than the algorithms. It includes ideas like machine-generated training data and automatic tagging. Christopher Ré, at one of the last Strata conferences, noted that data collection was the part of AI that was most resistant to improvement.
- We’ve seen that GPT-3 can generate code from English language comments. Can it generate Kubernetes configurations from natural language descriptions? Take a look at AI Kube Bot.
- The US Copyright Office has determined that works created by an artificial intelligence aren’t copyrightable; copyright requires human authorship. This is almost certainly not the final word on the topic.
- A neural network with a single neuron that is used many times may be as effective as large neural networks, while using much less energy.
- Training AI models on synthetic data created by a generative model can be more effective than using real-world data. Although there are pitfalls, there’s more control over bias, and the data can be made to include unexpected cases.
- For the past 70 years, computing has been dominated by general-purpose hardware: machines designed to run any code. Even vector processors and their descendants (GPUs) are fundamentally general purpose. The next steps forward in AI may involve software, hardware, and neural networks that are designed for each other.
- Ithaca is a DeepMind project that uses deep learning to recover missing texts in ancient Greek documents and inscriptions. It’s particularly interesting as an example of human-machine collaboration. Humans alone do this work with 25% accuracy and Ithaca alone reaches 62%, but Ithaca and humans combined reach 72% accuracy.
- Michigan is starting to build the infrastructure needed to support autonomous vehicles: dedicated lanes, communications, digital signage, and more.
- Polycoder is an open source code generator (like Copilot) that uses GPT-2, which is also open source. Its developers claim that Polycoder is better than Copilot for many tasks, including programming in C. Because it is open source, it enables researchers to investigate how these tools work, including testing for security vulnerabilities.
- New approaches to molecule design using self-supervised learning on unlabeled data promise to make drug discovery faster and more efficient.
- The title says it all: Converting English to Regular Expressions with GPT-3, implemented as a Google Sheet. Given Copilot, it’s not surprising that this can be done.
- Researchers at MIT have developed a technique for injecting fairness into a model itself, even after it has been trained on biased data.
- Low-code programming for Python: Some new libraries designed for use in Jupyter Notebooks (Bamboolib, Lux, and Mito) allow a graphical (forms-based) approach to working with data using Python’s Pandas library.
- Will the Linkerd service mesh displace Istio? Linkerd seems to be simpler and more attractive to small and medium-sized organizations.
- The biggest problem with Stack Overflow is the number of answers that are out of date. There’s now a paper studying the frequency of out-of-date answers.
- Silkworm-based encryption: Generating good random numbers is a difficult problem. One surprising new source of randomness is silk. While silk appears smooth, it is (not surprisingly) very irregular at a microscopic scale. Because of this irregularity, passing light through silk generates random diffraction patterns, which can be converted into random numbers.
- The Hub for Biotechnology in the Built Environment (HBBE) is a research center that is rethinking buildings. They intend to create “living buildings” (and I do not think that is a metaphor) capable of processing waste and producing energy.
- A change to the protein used in CRISPR to edit DNA reduces errors by a factor of 4000, without making the process slower.
- Researchers have observed the process by which brains store sequences of memories. In addition to therapies for memory disorders, this discovery could lead to advances in artificial intelligence systems, which don’t really have the ability to create and process timelines or narratives.
- Object detection in 3D is a critical technology for augmented reality (to say nothing of autonomous vehicles), and it’s significantly more complex than in 2D. Facebook/Meta’s 3DETR uses transformers to build models from 3D data.
- Some ideas about what Apple’s AR glasses, Apple Glass, might be. Take what you want… Omitting a camera is a good idea, though it’s unclear how you’d make AR work. This article suggests LIDAR, but that doesn’t sound feasible.
- According to the creator of Pokemon Go, the metaverse should be about helping people to appreciate the physical world, not about isolating them in a virtual world.
- Jeff Carr has been publishing (and writing about) dumps of Russian data obtained by hackers from GURMO, Ukraine’s cyber operations team.
- Sigstore is a new kind of certificate authority (trusted root) that is addressing open source software supply chain security problems. The goal is to make software signing “ubiquitous, free, easy, and transparent.”
- Russia has created its own certificate authority to mitigate international sanctions. However, users of Chrome, Firefox, Safari, and other browsers originating outside of Russia would have to install the Russian root certificate manually to access Russian sites without warnings.
- Corporate contact forms are replacing email as a vector for transmitting malware. BazarBackdoor [sic] is now believed to be under development by the Conti ransomware group.
- Dirty Pipe is a newly discovered high-severity bug in the Linux kernel that allows any user to overwrite any file or obtain root privileges. Android phones are also vulnerable.
- Twitter has created an onion service that is accessible through the Tor network. (Facebook has a similar service.) This service makes Twitter accessible within Russia, despite government censorship.
- The attackers attacked: A security researcher has acquired and leaked chat server logs from the Conti ransomware group. These logs include discussions of victims, Bitcoin addresses, and discussions of the group’s support of Russia.
- Attackers can force Amazon Echo devices to hack themselves. Get the device to speak a command, and its microphone will hear the command and execute it. This misfeature includes controlling other devices (like smart locks) via the Echo.
- The Anonymous hacktivist collective is organizing (to use that word very loosely) attacks against Russian digital assets. Among other things, they have leaked emails between the Russian defense ministry and their suppliers, and hacked the front pages of several Russian news agencies.
- The Data Detox Kit is a quick guide to the bot world and the spread of misinformation. Is it a bot or not? This site has other good articles about how to recognize misinformation.
- Sensor networks that are deployed like dandelion seeds! An extremely light, solar-powered framework for scattering RF-connected sensors lets the breeze do the work, so researchers can easily build networks with thousands of sensors. I’m concerned about cleanup afterwards, but this is a breakthrough, both in biomimicry and low-power hardware.
- Semiconductor-based LIDAR could be the key to autonomous vehicles that are reasonably priced and safe. LIDAR systems with mechanically rotating lasers have been the basis for Google’s autonomous vehicles; they are effective, but expensive.
- The open source instruction set architecture RISC-V is gaining momentum because it is enabling innovation at the lowest levels of hardware.
- Microsoft claims to have made a breakthrough in creating topological qubits, which should be more stable and scalable than other approaches to quantum computing.
- IBM’s quantum computer was used to simulate a time crystal, showing that current quantum computers can be used to investigate quantum processes, even if they can’t yet support useful computation.
- Mozilla has published their vision for the future evolution of the web. The executive summary highlights safety, privacy, and performance. They also want to see a web on which it’s easier for individuals to publish content.
- Twitter is expanding its crowdsourced fact-checking program (called Birdwatch). It’s not yet clear whether this has helped stop the spread of misinformation.
- The Gender Pay Gap Bot (@PayGapApp) retweets corporate tweets about International Women’s Day with a comment about the company’s gender pay gap (derived from a database in the UK).
- Alex Russell writes about a unified theory of web performance. The core principle is that the web is for humans. He emphasizes the importance of latency at the tail of the performance distribution; improvements there tend to help everyone.
- WebGPU is a new API that gives web applications the ability to do rendering and computation on GPUs.
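
The web performance item above stresses latency at the tail of the distribution. As a rough illustration only, the sketch below compares the mean and median of a set of page-load times with the 95th and 99th percentiles. The function names (`latency_summary`, `simulate_page_loads`) and the synthetic numbers are assumptions made up for this example; they do not come from Alex Russell's article or from any real measurement.

```python
import random
import statistics


def latency_summary(samples_ms):
    """Return the median, 95th, and 99th percentile of latency samples."""
    # quantiles(n=100) returns the 99 cut points for percentiles 1 through 99.
    cuts = statistics.quantiles(samples_ms, n=100, method="inclusive")
    return {"p50": cuts[49], "p95": cuts[94], "p99": cuts[98]}


def simulate_page_loads(count, slow_fraction, seed=0):
    """Generate synthetic load times: most are fast, a small share is very slow.

    The ranges below are invented for illustration, not measured data.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(count):
        if rng.random() < slow_fraction:
            samples.append(rng.uniform(4000, 12000))  # hypothetical slow devices or networks
        else:
            samples.append(rng.uniform(300, 900))  # hypothetical typical loads
    return samples


if __name__ == "__main__":
    loads = simulate_page_loads(count=10_000, slow_fraction=0.05)
    print(f"mean: {statistics.mean(loads):.0f} ms")
    for name, value in latency_summary(loads).items():
        print(f"{name}: {value:.0f} ms")
```

In this synthetic setup the median stays in the fast range while the 99th percentile lands among the slow loads, which is why a report that shows only the mean or median can hide the experience of the slowest users.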