Washington: A team led by an Indian-American scientist has begun to describe theoretical limits on the degree of imprecision that communicating computers can tolerate – with very real implications for the design of communication protocols.
Communication protocols for digital devices are very efficient but also very brittle: they require information to be specified in a precise order, with a precise number of bits.
If sender and receiver – say a computer and a printer – are off by even a single bit relative to each other, communication between them breaks down entirely.
Madhu Sudan, an adjunct professor of electrical engineering and computer science at Massachusetts Institute of Technology (MIT) and a principal researcher at Microsoft Research New England, wants to bring human-type flexibility to computer communication.
One thing that humans do well is gauging the minimum amount of information they need to convey in order to get a point across. Depending on the circumstances, for instance, one co-worker might ask another, “Who was that guy?”; “Who was that guy in your office?”; “Who was that guy in your office this morning?”; or “Who was that guy in your office this morning with the red tie and glasses?”
Similarly, the first topic Sudan and his colleagues began investigating is compression – the minimum number of bits that one device needs to send another in order to convey all the information in a data file. Existing data compression schemes already exploit statistical regularities in data.
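As an illustration of what "exploiting statistical regularities" means, the sketch below (not from Sudan's paper – purely a standard textbook example) builds a Huffman code, which assigns shorter bit strings to more frequent symbols:

```python
import heapq
from collections import Counter

def huffman_codes(text):
    """Build a Huffman code: frequent symbols get shorter bit strings."""
    freq = Counter(text)
    # The running index breaks ties so heapq never compares symbols directly.
    heap = [(n, i, sym) for i, (sym, n) in enumerate(freq.items())]
    heapq.heapify(heap)
    i = len(heap)
    while len(heap) > 1:
        n1, _, left = heapq.heappop(heap)
        n2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (n1 + n2, i, (left, right)))
        i += 1
    codes = {}
    def walk(node, prefix):
        if isinstance(node, tuple):          # internal node: recurse
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:                                # leaf: record the symbol's code
            codes[node] = prefix or "0"
    walk(heap[0][2], "")
    return codes

text = "abracadabra"
codes = huffman_codes(text)
compressed = "".join(codes[c] for c in text)
```

Because "a" is the most frequent symbol here, it receives one of the shortest codes, and the whole message takes fewer bits than a fixed 8-bits-per-character encoding. Sudan's work asks what happens when the two sides' estimates of these frequencies disagree.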
In a paper to be presented at the next “Innovations in Theoretical Computer Science” (ITCS) conference in January, Sudan and colleagues at Columbia University, Carnegie Mellon University, and Microsoft add even more uncertainty to their compression model.
In the new paper, sender and receiver not only have somewhat different probability estimates but also slightly different codebooks. The researchers were nonetheless able to devise a protocol that still provides good compression, and they generalised their model to new contexts.
For instance, Sudan says, in the era of cloud computing, data is constantly being duplicated on servers scattered across the internet, and data-management systems need to ensure that the copies are kept up to date.
One way to do that efficiently is by performing “checksums” – adding up a bunch of bits at corresponding locations in the original and the copy and making sure the results match, he said.
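A minimal sketch of the idea, using the standard CRC-32 checksum from Python's library rather than a literal bit-sum (the data and block size are invented for illustration): each copy is checksummed block by block, and only blocks whose checksums disagree need to be re-sent.

```python
import zlib

def block_checksums(data: bytes, block_size: int = 4) -> list[int]:
    # Checksum each fixed-size block; comparing the two lists
    # pinpoints which blocks of a replica have gone stale.
    return [zlib.crc32(data[i:i + block_size])
            for i in range(0, len(data), block_size)]

original = b"the quick brown fox jumps"
replica  = b"the quick green fox jumps"  # one word has drifted

stale = [i for i, (a, b) in enumerate(
    zip(block_checksums(original), block_checksums(replica))) if a != b]
```

Only the mismatched blocks are transmitted, so keeping a copy up to date costs far less than re-sending the whole file – the kind of efficiency cloud data-management systems rely on.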