Sunday, 10 April 2011

Superbrain Watson looking for work

English Lessons; Jeopardy-winning computer aimed at Holy Grail of AI
 
 
 
First came Jeopardy. The next step may be Star Trek.
One of the designers of IBM's Watson super-computer said the goal for the next version of the machine is to seamlessly understand human language and answer questions, like the ever-present computer that responds to the commands of Captains Kirk and Picard on the Star Trek television series.
"That's the holy grail of AI (Artificial Intelligence)," said Aditya Kalyanpur, a researcher at IBM's Thomas J. Watson Institute.
Kalyanpur, one of the engineers who designed Watson's brain, said the computer has achieved a remarkable breakthrough in the ability to answer questions phrased in complex English. It paves the way for a computer that would understand all human phrases and produce answers to any question, by sifting through huge stores of data.
Watson was designed to play Jeopardy - as a way to showcase advanced language analytical technology and problem solving. In February, the computer faced off against two of the greatest Jeopardy champions in the show's history and won cleanly. But it was a lot of work to get to that point. Kalyanpur said the first version of the computer took two hours to produce just one answer. During the quiz show, the average response time was three seconds.
"People have been really amazed," said Kalyanpur, who spoke Thursday at the Palais des congrès as part of the Crystal Ball conference on IT, organized by the Centre de recherche informatique de Montréal. "Jeopardy was actually a really good demo of this technology. It was a way to engage the average person so they can understand what the challenges are with language recognition."
He said the computer's greatest achievement was its ability to analyze human wordplay and figure out what was actually being asked in the show's tricky clues. The computer then had to determine how sure it was of its answer, so that if it was not confident enough, it would not buzz in and risk answering wrong and losing money.
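For readers curious what such a buzz decision might look like in code, here is a minimal sketch in Python. The candidate answers, scores, and the 0.5 threshold are invented for illustration; IBM's actual confidence estimation is far more elaborate.

```python
# Minimal sketch of confidence-gated answering, as described above.
# The candidate scores and the 0.5 threshold are illustrative assumptions,
# not Watson's real scoring model.

def should_buzz(candidates, threshold=0.5):
    """Buzz in only if the best-scoring answer clears the threshold."""
    best_answer, best_score = max(candidates.items(), key=lambda kv: kv[1])
    if best_score >= threshold:
        return best_answer
    return None  # stay silent rather than risk losing money

# Hypothetical candidate answers with confidence scores between 0 and 1.
candidates = {"What is Toronto?": 0.14, "What is Chicago?": 0.97}
print(should_buzz(candidates))              # -> "What is Chicago?"
print(should_buzz({"Who is Poe?": 0.31}))   # -> None (not confident enough)
```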
Now that the Jeopardy experiment was a success, Watson is looking for work, Kalyanpur said. The supercomputer has been touted as a tool to help doctors make accurate diagnoses.
"We're looking at a host of different application areas," Kalyanpur said. "Health care is one big one, but also enterprise intelligence, government, legal and tech support. Any domain where there is a lot of information that is unstructured, and you want to make sense of that information.
"We're still working out the mode of deployment. We're working on making a miniaturized version of Watson. Something else that might work out is a cloud model. Where all the hardware is in the cloud, and the user just has access to the question answering and the text analytic capability."
jmagder@montrealgazette.com


Read more: http://www.montrealgazette.com/entertainment/Superbrain+Watson+looking+work/4579446/story.html#ixzz1J79t5bvC

Sunday, 3 April 2011

LexisNexis Unveils Next Generation of IP Research Technology with New Semantic Search “Brain”

NEW YORK - LexisNexis, a leading global provider of content-enabled workflow solutions, announced in a press release the debut of an innovative new semantic search “brain” for its full complement of intellectual property (IP) research products.
The next-generation semantic search technology identifies the meaning of multiple concepts within a single search query to help users zero in on core concepts faster and make fewer revisions to their search queries.
The technology will power the patent research and retrieval service LexisNexis® TotalPatent™, the automated patent application and analysis product LexisNexis® PatentOptimizer™, and IP research across patent and non-patent literature conducted on the flagship lexis.com® online legal research service.
Semantic search uses the science of meaning in language (“semantics”) to produce highly relevant search results. LexisNexis launched its semantic search technology 18 months ago, significantly enhancing the search process for patent researchers through technology that delivers results based on an analysis of the meaning of the language used in search queries – not just the words themselves.
The new semantic search technology takes this science to the next level by enhancing its ability to identify multiple concepts contained within a single search query. Thus, if a patent researcher asks the LexisNexis search engine to find information about a complex subject, the new semantic brain will actually identify various possible ideas contained in that request and return related concepts for each idea in their query. The researcher can then review the concepts suggested, assign relative importance by weighting them, eliminate concepts that aren’t related, and even add more concepts they think might be useful to the search project.
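To make the weighting step concrete, here is a toy sketch of how user-weighted concepts could re-rank documents. The concepts, weights, and scoring rule are illustrative assumptions, not LexisNexis's proprietary method.

```python
# Illustrative sketch of concept-weighted scoring, loosely following the
# workflow described above. All names and numbers are invented.

def score(document_terms, weighted_concepts):
    """Sum the weights of the query concepts that appear in a document."""
    return sum(w for concept, w in weighted_concepts.items()
               if concept in document_terms)

# The researcher reviews the suggested concepts, weights the important ones,
# and drops the irrelevant one by assigning it weight 0.
weighted_concepts = {"fuel cell": 3.0, "membrane": 2.0, "electrolysis": 0.0}

docs = {
    "US1234": {"fuel cell", "membrane", "catalyst"},
    "US5678": {"electrolysis", "membrane"},
}
for doc_id, terms in sorted(docs.items(),
                            key=lambda kv: score(kv[1], weighted_concepts),
                            reverse=True):
    print(doc_id, score(terms, weighted_concepts))
```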
“We believe that the most important brain in the patent research process is the researcher’s own brain,” said Steven Errick, vice president of Research Information at LexisNexis. “When the user experience is combined with the semantic search capability, it becomes a powerful tool that can deliver the most precise and relevant patent search results available in the industry.”
LexisNexis also introduced a series of enhancements to its award-winning TotalPatent service. The most notable addition is a new "Visualize & Compare" tool that allows users to compare and analyze any two or three result sets or lists of patents, regardless of the underlying search mechanism - for example, comparing a Boolean search result with a semantic search result.
The new comparison capability not only highlights documents that were uniquely surfaced in one query or list versus another, but also serves as an important tool to assist researchers in analyzing and improving their overall search strategy and queries to find the most precise documents. The comparison tool will also give patent researchers greater confidence that they have executed the most comprehensive search possible, thereby lowering the risk of missing crucial documents.

Saturday, 2 April 2011

Health Advice by Dr. Weiss: Artificial intelligence in medicine

“Nothing endures but change,” said the Greek philosopher Heraclitus, and healthcare is changing rapidly in so many ways as we better utilize technology.
The next major frontier may well be the manner in which patients share information with their physicians, physician assistants, and nurse practitioners.
Picture yourself as a patient sharing the details of a recent illness with a computer, which listens patiently and responds in perfect English (or whatever language you are most comfortable with) by asking pertinent questions. As you give your history, the computer weighs the odds of possible diagnoses along with appropriate therapies.
This scenario used to sound like science fiction, but is now being developed by IBM’s Watson computer. Watson made a huge public splash recently when it—I don’t know whether to say “it” or “he” or “she”—impressively played Jeopardy! against the best human players in the game show’s history. Watson was not connected to the Internet but was able to understand English, deal with most of the nuances of our language, quickly search through enormous databases, and answer in the appropriate manner.
The real utility of Watson is not its ability to win at TV game shows, but rather the fact that Watson can change the way we practice medicine. Here’s why:
• Watson possesses strong communication skills;
• It combines that with the ability to search immense databases almost instantly;
• And then logically comes up with correct solutions to problems.
Imagine patient Jones or physician Smith speaking with Watson about a clinical situation.
All the already-digitized forms of information about Ms. Jones—such as laboratory results, vital signs and past medical history—would be readily available and easily inputted.
Dr. Smith or a mid-level provider could add the results of a current physical exam, and Watson would begin to calculate the probabilities of having a specific disease.
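As a rough illustration of that kind of calculation, the sketch below applies a naive Bayes update over invented diseases and findings; a real diagnostic system would be vastly more sophisticated.

```python
# A toy naive-Bayes sketch of the probability calculation imagined above.
# The diseases, findings, and numbers are invented for illustration only.

def posterior(priors, likelihoods, findings):
    """P(disease | findings), assuming conditionally independent findings."""
    scores = {}
    for disease, prior in priors.items():
        p = prior
        for f in findings:
            p *= likelihoods[disease].get(f, 0.01)  # small default likelihood
        scores[disease] = p
    total = sum(scores.values())
    return {d: p / total for d, p in scores.items()}

priors = {"flu": 0.05, "strep": 0.02}
likelihoods = {
    "flu":   {"fever": 0.9, "cough": 0.8, "sore_throat": 0.4},
    "strep": {"fever": 0.7, "cough": 0.1, "sore_throat": 0.9},
}
print(posterior(priors, likelihoods, ["fever", "sore_throat"]))
```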
Cornerstone of Care
Watson would have the luxury in this situation of asking additional questions, suggesting other tests be completed, perhaps recommending a therapeutic trial, or just observing. The physician would remain the cornerstone of care. But now he or she would have a powerful ally.
Today, a physician does exactly what Watson will be doing. But the physician depends on his or her memory and knowledge base, which is a small fraction of what Watson utilizes. And once Watson is functional for one patient, the cost of diagnosing and caring for additional patients is minimal.
Dr. Herbert Chase at the College of Physicians and Surgeons of Columbia University (my alma mater) has welcomed Watson to campus. Winning Jeopardy! is entertaining, but changing medicine is serious business. As the complexity of healthcare has grown along with the costs which need to be managed, Watson—or son of Watson—may be the solution.
Think back to other revolutionary solutions such as the invention of gunpowder, Gutenberg’s printing press, the airplane, the Internet, and how each one changed civilization. We are now in the middle of a healthcare computer revolution which will have a more profound effect on healthcare reform than the current acrimonious national debate about “Accountable Care.”
Computer technology and the Internet have changed the way so many things function. Consider just a few venues:
• Political revolutions: Egypt’s dramatic change was fueled by Facebook and Twitter.
• Dating: Online matchmaking spawned one in eight marriages last year.
• Higher education: The University of Phoenix, online, has by far the largest number of MBA students in the country.
• Financial services: Instant updates and rapid communication with ubiquitous email.
And then we come to healthcare.
• There are many disease-centered websites where patients can log in as themselves (or anonymously with a pseudonym) to share experiences, provide support, keep up on the latest treatments, and most importantly, be active participants in their care.
• The old paternalistic model of care where “Father Knows Best” has been turned upside down as patients aggregate knowledge on sites such as “Patients Like Me”— http://www.patientslikeme.com/ —where you can search for patients with the same disease or symptom complex, get answers to questions, and help others.
Watson could help a patient in the privacy of his or her own home, changing that person from a passive to an active participant. That’s a game changer. Think about the disparity of care for those who are insured vs. those who are uninsured and impoverished. Once Watson is up and running, each additional patient cared for means only a small additional cost.
Collaborative Healthcare
The Internet is global and seamless when it comes to boundaries. Watson will surely understand all the world’s languages; many are already in the voice recognition vocabulary of existing software which has been used for medical transcription for more than a decade.
In fact, I was using voice recognition in my medical office successfully from 1998 to 2000 when I left my solo office practice to work with the NCH Healthcare System. Back in that “last century” (a little more than a decade ago!), new patients could leave my office with a copy of their consultation note in hand. It had been created with the patient watching me dictate their history, physical, my impressions and the therapeutic plan into the medical record while using a microphone and a commercially available computer.
The internet will also encourage collaborative healthcare. For example, patient-produced content would be shared much the same way that Wikipedia shares content generated by contributors.
This shift—from a small group having knowledge to shared, participatory involvement—parallels the change from the Middle Ages to the Renaissance (which was catalyzed by Gutenberg’s press). We now have the equivalent, namely the Internet. Add Watson as the interface and health care can be changed in profound ways that were unimaginable just a few years ago. And to be frank, some of these changes will be unwelcome by those whose previous positions of power and importance are being undermined.
Nonetheless, when huge population studies can be done quickly, and vast data troves can be aggregated and shared, the entire global community will better understand illness and disease. The whole idea of evidence-based medicine will mature much faster as everyone shares medical experiences.
Privacy and confidentiality issues will always be valid concerns, but there are good working models in the banking and financial industry of how to safeguard privacy.
Finally, will everyone have a personal home health page started at birth (or conception) to carry with them throughout their lives? This is an interesting concept, and the technology backbone for it is already in place.
Having your medical information in one place will empower you to watch your weight over decades, monitor your cholesterol as you modify your eating habits, predict when you will have problems and make changes in the 70% of your habits which cause illness and which you can control. Today you can go to a site called Real Age, take a short quiz about your past health, and see how your chronologic age compares to your physiological age. http://www.realage.com/landing/entry4?cbr=MSNSRCH025
I believe that technology, the internet, and Watson will change the face of healthcare forever. We will probably resist change for a while, because that’s typical of human nature. But when you consider the enormous rate of change of just the last two decades, I think you’ll have an idea of how quickly we can evolve.
“Dr. Watson” was Sherlock Holmes’ companion and narrator. Now Watson takes on a new role.
Allen S. Weiss, M.D., President and CEO, NCH Healthcare System
Please feel free to share this article or contact me here:
http://www.nchmd.org/healthadvice
To view past issues of Health Advice by Dr. Weiss click:
http://www.nchmd.org/healthadvicearchive

Friday, 1 April 2011

Putting the "Smart" In Smartphone: Iphone 5 Might Include AI

A very interesting rumor about the iPhone 5 has been going around, involving a little AI called Siri. The news, first circulated by TechCrunch, concerns Apple's purchase last year of the artificial intelligence, which takes the form of a virtual personal assistant. Either they've been sitting on the technology since its acquisition, or they were simply waiting for the right moment to announce how they plan to put it to use.
Knowing Apple, it's probably the latter. 

"You’re busy. Between meetings, social events, and hopefully a workout or two, your schedule’s packed. Don’t you wish you could hand off simple tasks so you could have more time to play? That’s why we built Siri. Because we believe everyone could use an assistant. Because we believe there’s a simpler way to get things done. Just like a real assistant, Siri understands what you say, accomplishes tasks for you and adapts to your preferences over time." (Siri.com).
Siri, originally developed as an iPhone app, was the first product created by a company of the same name, one which focuses exclusively on artificial intelligence. In April 2010, Siri evidently caught Apple's eye, and Apple purchased the technology from the company for its own use. The original app was pretty much what it said on the box: a virtual assistant that helped you figure out what movies to watch, suggested restaurants, found available taxi services, and so on.
It worked by connecting to Siri's servers on the web, which drew on a wide array of information services such as TaxiMagic and Rotten Tomatoes. What's more, Siri's designers promise that "right now, Siri's learning how to handle reminders, flight stats and reference questions. Our vision is that, over time, you'll trust Siri to manage many personal details in your life - from recommending a wine you might enjoy to managing your to do list."

Talk To Your Phone
It seems pretty clear what all this means when you put everything together. Though things aren't quite set in stone yet, Apple's evidently looking to completely integrate Siri into the iPhone 5. This innovation takes mobile communication to a whole new level. If Siri evolves along the lines that its creators have planned for it, it'll be more than just an information service or a passive virtual assistant; it'll be a full-blown entity. Your phone will be able to independently manage your schedule, organize your to-do list, and remind you of important appointments, all the while responding to your questions and concerns much as a human secretary might.
Very cool indeed; I'm quite interested in seeing how this will turn out. We'll find out more at the Apple Worldwide Developers Conference, scheduled for June 6-10 in San Francisco. Until then, we'll just have to be patient.

Wednesday, 30 March 2011

It's all in the name, iRobot!!

iRobot Corp. designs robots that perform dull, dirty or dangerous missions in a better way. The company's proprietary technology, iRobot AWARE Robot Intelligence Systems, incorporates advanced concepts in navigation, mobility, manipulation and artificial intelligence. This proprietary system enables iRobot to build behavior-based robots, including its family of consumer and military robots.

Tuesday, 29 March 2011

Could AI be used to launch and guide rockets?

Could artificial intelligence help rockets launch themselves? With greater automation, rockets would be capable of self-checking for problems, self-diagnosing, and, hopefully, fixing minor pre- or post-launch issues.
"So far, rockets are merely automatic. They are not artificially intelligent," said Yasuhiro Morita, a professor at Institute of Space and Astronautical Science at JAXA, Japan's aerospace organisation.


However, according to Morita, the Epsilon launch vehicle - tentatively scheduled for a 2013 launch - is slated to include a whole new level of automation.
Modern rockets already have some elements of automation: sensors can alert engineers to malfunctions, for example, but can't tell them much about what type of problem it is or what type of solution is needed.
But in the future?
"The AI will diagnose the condition of the rocket, but it is more than that," Morita said. Should there be an issue, "the AI system will determine the cause of a malfunction," and potentially fix the problem itself.

Monday, 28 March 2011

CBA develops system to combat money laundering and terrorism financing

“CBA attaches importance to comprehensive management, safe maintenance and efficient analysis of information related to money laundering and terrorism financing,” Central Bank of Armenia (CBA) Chairman Arthur Javadyan said during the event.

He expressed gratitude to the U.S. authorities for the technical aid in introducing the system and voiced hope that cooperation between the CBA and the U.S. embassy will continue through other programs to develop systems for combating money laundering and terrorism financing.
U.S. Ambassador to Armenia Marie Yovanovitch said for her part that the Automated Management System is an important project for tackling money laundering and terrorism financing. The most efficient method of the struggle against money laundering is to deprive criminals of the opportunity to manage their incomes, she said.

Representatives of the Office of RA Prosecutor General, MFA, Police, National Security Service, Union of Armenian Banks and AI Partnership organization also participated in the event.

Saturday, 26 March 2011

Artificial Intelligence: Job Killer


A recent article in the New York Times points out that sophisticated data analytics software is doing the kinds of jobs once reserved for highly paid specialists. Specifically, it talks about data mining-type software applied to document discovery for lawsuits. In this realm, these applications are taking the place of expensive teams of lawyers and paralegals.
Basically, it works by performing deep analysis of text to find documents pertinent to the case at hand. It's not just a dumb keyword search; the software is smart enough to find relevant text even in the absence of specific terms. One application was able to analyze 1.5 million documents for less than $100,000 -- a fraction of the cost of a legal team, and in a fraction of the time.
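A crude sketch of searching beyond literal keywords: the toy code below expands a query with related terms before matching, so a document can be found even when the exact search words never appear. The synonym table and matching rule are assumptions for illustration, not any vendor's actual algorithm.

```python
# Toy "more than a dumb keyword search": expand the query with related
# terms, then match. The RELATED map is an invented stand-in for the deep
# semantic analysis the article describes.

RELATED = {
    "termination": {"dismissal", "firing", "severance"},
    "payment": {"remittance", "compensation", "invoice"},
}

def expand(query_terms):
    expanded = set(query_terms)
    for term in query_terms:
        expanded |= RELATED.get(term, set())
    return expanded

def relevant(document_text, query_terms):
    words = set(document_text.lower().split())
    return bool(expand(query_terms) & words)

doc = "the severance agreement set compensation at two months salary"
print(relevant(doc, {"termination", "payment"}))  # True, despite no exact match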
Mike Lynch, founder of Autonomy (a UK-based e-discovery company), thinks this will lead to a shrinking legal workforce in the years ahead. From the article:
He estimated that the shift from manual document discovery to e-discovery would lead to a manpower reduction in which one lawyer would suffice for work that once required 500 and that the newest generation of software, which can detect duplicates and find clusters of important documents on a particular topic, could cut the head count by another 50 percent.
Such software can also be used to connect chains of events mined from a variety of sources: e-mail, instant messages, telephone calls, and so on. Used in this manner, it can be used to sift out digital anomalies to track various types of criminal behavior. Criminals, of course, are one workforce we'd like to reduce.  But what about the detectives that used to perform this kind of work?
The broader point the NYT article illuminates is that software like this actually targets mid-level white collar jobs, rather than low-end labor jobs we usually think of as threatened by computer automation. According to David Autor, an economics professor at MIT, this is leading to a "hollowing out" of the US economy. While he doesn't think technology like this is driving unemployment per se, he believes the job mix will inevitably change, and not necessarily for the better.
It's the post-Watson era. Get used to it.

Saturday, 19 March 2011

The past, present and future of cancer

Leading cancer researchers reflected on past achievements and prospects for the future of cancer treatment during a special MIT symposium on Wednesday titled “Conquering Cancer through the Convergence of Science and Engineering.”

The event, one of six academic symposia taking place as part of MIT’s 150th anniversary, focused on the Institute’s role in studying the disease over the past 36 years since the founding of MIT’s Center for Cancer Research.

During that time, MIT scientists have made critical discoveries that resulted in new cancer drugs such as Gleevec and Herceptin. The center has since become the David H. Koch Institute for Integrative Cancer Research, which now includes a mix of biologists, who are trying to unravel what goes wrong inside cancer cells, and engineers, who are working on turning basic science discoveries into real-world treatments and diagnostics for cancer patients.

That “convergence” of life sciences and engineering is key to making progress in the fight against cancer, said Institute Professor Phillip Sharp, a member of the Koch Institute. “We need that convergence because we are facing a major demographic challenge in cancer as well as a number of other chronic diseases” that typically affect older people, such as Alzheimer’s, Sharp said.

In opening the symposium, MIT President Susan Hockfield said that MIT has “the right team, in the right place, at the right moment in history” to help defeat cancer.

“It’s in the DNA of MIT to solve problems,” said Tyler Jacks, director of the Koch Institute. “I’m very optimistic and very encouraged about what this generation of cancer researchers at MIT will do to overcome this most challenging problem.”

Past and present

In the past few decades, a great deal of progress has been made in understanding cancer, said Nancy Hopkins, the Amgen, Inc. Professor of Biology and Koch Institute member, who spoke as part of the first panel discussion, on major milestones in cancer research.

In the early 1970s, before President Richard Nixon declared the “War on Cancer,” “we really knew nothing about human cells and what controls their division,” Hopkins recalled. Critical discoveries by molecular biologists, including MIT’s Robert Weinberg, revealed that cancer is usually caused by genetic mutations within cells.

The discovery of those potentially cancerous genes, including HER2 (often mutated in breast cancer), has led to the development of new drugs that cause fewer side effects in healthy cells. While that is a major success story, many other significant discoveries have failed to make an impact in patient treatment, Hopkins said.

“The discoveries we have made are not being exploited as effectively as they could be,” Hopkins said. “That’s where we need the engineers. They’re problem-solvers.”

Institute Professor Robert Langer described his experiences as one of the rare engineers to pursue a career in biomedical research during the 1970s. After he finished his doctoral degree in chemical engineering in 1974, “I got four job offers from Exxon alone,” plus offers from several other oil companies. But Langer had decided he wanted to do something that would more directly help people, and ended up getting a postdoctoral position in the lab of Judah Folkman, the scientist who pioneered the idea of killing tumors by cutting off their blood supplies.

In Folkman’s lab, Langer started working on drug-delivering particles made from polymers, which are now widely used to deliver drugs in a controlled fashion.

Langer and other engineers in the Koch Institute are now working on ways to create even better drug-delivery particles. Sangeeta Bhatia, the Wilson Professor of Health Sciences and Technology and Electrical Engineering and Computer Science, described an ongoing project in her lab to create iron oxide nanoparticles that can be tagged with small protein fragments that bind specifically to tumor cells. Such particles could help overcome one major drawback to most chemotherapy: Only about 1 percent of the drug administered reaches the tumor.

“If we could simply take these poisonous drugs more directly to the tumors, it would increase their effectiveness and decrease side effects,” Bhatia said.

Other Koch engineers are working on new imaging agents, tiny implantable sensors, cancer vaccines and computational modeling of cancer cells, among other projects.

Personalized medicine


Many of the targeted drugs now in use came about through serendipitous discoveries, said Daniel Haber, director of the Massachusetts General Hospital Cancer Center, during a panel on personalized cancer care. Now, he said, a more systematic approach is needed. He described a new effort underway at MGH to test potential drugs on 1,000 different tumor cell lines, to find out which tumor types respond best to each drug.

At MIT, Koch Institute members Michael Hemann and Michael Yaffe have shown that patient response to cancer drugs that damage DNA can be predicted by testing for the status of two genes — p53, a tumor suppressor, and ATM, a gene that helps regulate p53.

Their research suggests that such drugs should be used only in patients whose tumors have mutations in both genes or neither gene — a finding that underscores the importance of understanding the genetic makeup of patients’ tumors before beginning treatment. It also suggests that current drugs could be made much more effective by combining them in the right ways.
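The both-or-neither rule above is simple enough to state as a one-line predicate; the toy sketch below just restates the logic (an illustration, not clinical software).

```python
# The "both genes mutated or neither gene mutated" rule, restated in code.
# Purely illustrative; real treatment decisions involve far more factors.

def dna_damaging_drug_indicated(p53_mutated: bool, atm_mutated: bool) -> bool:
    # Indicated only when both genes are mutated or neither is.
    return p53_mutated == atm_mutated

for p53 in (True, False):
    for atm in (True, False):
        print(p53, atm, "->", dna_damaging_drug_indicated(p53, atm))
```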

“The therapies of the future may not be new therapies,” Hemann said. “They may be existing therapies used significantly better.”

The sequencing of the human genome should also help achieve the goal of personalized cancer treatment, said Eric Lander, director of the Broad Institute and co-chair of the President’s Council of Advisors on Science and Technology, who spoke during a panel on biology, technology and medical applications. Already, the sequencing of the human genome has allowed researchers to discover far more cancer-causing genes. In 2000, before the sequence was completed, scientists knew of about 80 genes that could cause solid tumors, but by 2010, 240 were known.

Building on the human genome project, the National Cancer Institute has launched the Cancer Genome Atlas Project, which is sequencing the genomes of thousands of human tumors, comparing them to each other and to non-cancerous genomes. “By looking at many tumors at one time, you can begin to pick out common patterns,” Lander said.

He envisions that once cancer scientists have a more complete understanding of which genes can cause cancer, and the functions of those genes, patient treatment will become much more effective. “Doctors of the future will be able to pick out drugs based on that information,” he said.

Friday, 18 March 2011

Is government ready for the semantic Web?

So far it's been slow going, but an interagency XML project could boost law enforcement, health care efforts
When IBM’s Watson recently trounced the two most successful "Jeopardy!" players of all time, the supercomputer was relying in part on an emerging field of computerized language processing known as semantic technology.
In addition to enabling Watson to work out answers to questions such as which fruit trees provide flavor to Sakura cheese, semantic technology is capable of providing answers to questions that might interest government agencies and other groups that have historically had problems identifying patterns or probable sequences in oceans of data.
The idea is to help machines understand the context of a piece of information and how it relates to other bits of content. As such, it has the potential to improve search engines and enable computer systems to more readily exchange data in ways that could be useful to agencies involved in a wide range of pursuits, including homeland security and health care.
While semantic technology has mostly been an academic exercise in recent years, it is now finding a greater role in a practical-minded government project called the National Information Exchange Model (NIEM).

NIEM pursues intergovernment information exchange standards with the goal of helping agencies more readily circulate suspicious activity reports or issue Amber Alerts, for example. The goal is to create bridges, or exchanges, between otherwise isolated applications and data stores.
The building of those exchanges calls for a common understanding of the data changing hands. The richer detail of semantic descriptions makes for more precise matches when systems seek to consume data from other systems. Agreement on semantics also promotes reuse; common definitions let agencies recycle exchanges.
Semantics in government IT
Today, NIEM offers a degree of semantic support. But some observers believe the interoperability effort will take a deeper dive into semantic technology. They view NIEM as a vehicle that could potentially make semantics a mainstream component of government IT.
“Semantically, there is a huge opportunity with NIEM,” said Peter Doolan, vice president and chief technology officer at Oracle Public Sector, which is working on tools for NIEM. “NIEM is a forcing function for the broader adoption of the deeper semantic technology that we have talked about for some time.”

As more agencies adopt NIEM, the impetus for incorporating semantics will grow. NIEM launched in 2005 with the Justice and Homeland Security departments as the principal backers. Last year, the Health and Human Services Department joined Justice and DHS as co-partners. State and local governments, particularly in law enforcement, have taken to NIEM as well. And in a move that underscores that trend, the National Association of State Chief Information Officers last month joined the NIEM executive steering committee.
“NIEM adoption is going at a furious pace,” said Mark Soley, chairman and CEO of the Object Management Group (OMG), which has been working with NIEM. “As it gets adoption, they are going to need a way to translate information that is currently in other formats. That is when you need semantic descriptions.”
NIEM’s leadership says the program is prepared for greater use of semantics. “The NIEM program stands ready to respond to the overall NIEM community regarding a broader adoption of semantic technologies,” said DHS officials who responded to questions via e-mail.
Support for semantics
NIEM is based on XML. The project grew out of the Global Justice XML Data Model (GJXDM), a guide for information exchange in the justice and public safety sectors. Although XML serves as a foundational technology for data interoperability, it is not necessarily viewed as semantic.
However, John Wandelt, principal research scientist at the Georgia Tech Research Institute (GTRI) and division chief of that organization’s Information Exchange and Architecture Division, said semantic capability has been part of NIEM since its inception. GTRI serves as the technical architect and lead developer for GJXDM and NIEM.
“From the very early days, the community has pushed for strong semantics,” he said. Wandelt pointed to XML schema, which describes the data to be shared in an exchange. “Some say schema doesn’t carry semantics,” he said. “But the way we do XML schema in NIEM, it does carry semantics.”
NIEM’s Naming and Design Rules help programmers layer an “incremental set of semantics on top of base XML,” Wandelt said. For example, a group of XML programmers tasked to build a data model of their family trees would depict relationships between parents, siblings, and grandparents. But those ties would be implied and based entirely on an individual programmer’s way of modeling.
NIEM’s design rules, on the other hand, provide a consistent set of instructions for describing connections among entities. Wandelt said those rules make relationships explicit, thereby boosting semantic understanding.
NIEM also uses Resource Description Framework (RDF), an important underpinning of the Semantic Web, which has been slowly making its way into government IT (see sidebar).
RDF aims to describe data in a way that helps machines better understand relationships.
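To see the difference between implied and explicit relationships, consider the family-tree example above. The sketch below contrasts a record whose field names carry only implied meaning with RDF-style triples that name each relationship outright. It is illustrative only; NIEM's actual design rules are far richer.

```python
# Implicit: a nested record where "children" is just a field name whose
# meaning each programmer must guess from context.
family_record = {"name": "Ada", "children": ["Byron Jr.", "Annabella"]}

# Explicit: subject-predicate-object triples that name the relationship,
# so a machine consuming the data needs no out-of-band agreement.
triples = [
    ("Ada", "isParentOf", "Byron Jr."),
    ("Ada", "isParentOf", "Annabella"),
    ("Byron Jr.", "isSiblingOf", "Annabella"),
]

# Any system can now answer "who are Ada's children?" by pattern matching.
children = [o for s, p, o in triples if s == "Ada" and p == "isParentOf"]
print(children)
```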

see the full article here: http://gcn.com/articles/2011/03/21/niem-and-semantic-web.aspx

Monday, 14 March 2011

Collective Intelligence Outsmarts Artificial Intelligence

When computers first started to encroach on everyday life, science fiction authors and society in general had high expectations for "intelligent" systems. Isaac Asimov's "I, Robot" series from the 1940s portrayed robots with completely human intelligence and personality, and, in the 1968 movie "2001: A Space Odyssey," the onboard computer HAL (Heuristically programmed ALgorithmic computer) had a sufficiently human personality to suffer a paranoid break and attempt to murder the crew!
While the computer revolution has generally outstripped almost all expectations for the role of computers in society, in the area of artificial intelligence (AI), the predictions have, in fact, outstripped our achievements. Attempts to build truly intelligent systems have been generally disappointing.
Fully replicating human intelligence would require a comprehensive theory of consciousness which we unfortunately lack. Therefore, AI has generally attempted to focus on simulating intelligent behavior, rather than intelligence itself. In the algorithmic approach, programmers labor to construct sophisticated programs that emulate a specific intelligent behavior, such as voice recognition. In the other traditional approach - expert systems - a database of facts is collected, and logical routines applied to perform analysis and deduction. Expert systems have had some success in medical and other diagnostic applications, such as systems performance management.
Each of these approaches has shown success in limited scenarios, but neither achieves the sort of broadly intelligent system promised in the early days of computing. Attempts to emulate more human-like cognitive or learning systems (using technologies such as neural nets, fuzzy logic, and genetic algorithms) have only slightly improved the intelligence of everyday software applications.
Most of us experience the limitations of artificial intelligence every day. Spell-checkers in applications such as Microsoft Word do an amazingly poor job of applying context to language correction. As a result, sentences such as, "Eye have a spelling checker, it came with my pea sea," pass through the Microsoft spelling and grammar checker without a hitch. While the Microsoft software can recognize spelling mistakes in individual words, it cannot understand the meaning of the sentence as a whole, and the result is a long way from intelligent judgment. 
Collective intelligence offers a powerful alternative to traditional artificial intelligence paradigms. Collective intelligence leverages the inputs of large numbers of individuals to create solutions that traditional approaches cannot achieve. Although the term "collective intelligence" is not widely recognized, most of us experience the results of collective intelligence every day. For instance, Google uses collective intelligence when auto-correcting search inputs. Google has a large enough database of search terms to be able to automatically detect when you make an error and correct that error on-the-fly.  Consequently, Google is more than able to determine that "pea sea" is almost certainly meant to be "PC."
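For the curious, the toy corrector below shows the crowd-sourced idea in miniature (essentially a stripped-down version of Peter Norvig's well-known spelling-corrector sketch): candidates one edit away from the input are ranked by how often the crowd has typed them. The corpus counts are invented stand-ins for Google's query logs.

```python
# Frequency-based spelling correction in the spirit of the Google example.
# CORPUS stands in for a massive log of user-typed search terms.

from collections import Counter

CORPUS = Counter({"pc": 9000, "pea": 40, "sea": 300, "see": 5000})
LETTERS = "abcdefghijklmnopqrstuvwxyz"

def edits1(word):
    """All strings one edit (delete, transpose, replace, insert) away."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [a + b[1:] for a, b in splits if b]
    transposes = [a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1]
    replaces = [a + c + b[1:] for a, b in splits if b for c in LETTERS]
    inserts = [a + c + b for a, b in splits for c in LETTERS]
    return set(deletes + transposes + replaces + inserts)

def correct(word):
    """Pick the most frequently observed candidate, as the crowd would."""
    candidates = ({word} & CORPUS.keys()) or (edits1(word) & CORPUS.keys()) or {word}
    return max(candidates, key=lambda w: CORPUS[w])

print(correct("pce"))  # -> "pc": the crowd's frequency data decides
```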
Collective intelligence not only allows for superior spelling and grammar correction, but also is used in an increasingly wide variety of contexts, including spam detection, diagnostic systems, retail recommendations, predictive analytics, and many other fields. Increasingly, organizations find that it is more effective to apply brute force algorithms to masses of data generated by thousands of users, than to attempt to explicitly create sophisticated algorithmic models. 
The ability of collective intelligence to solve otherwise intractable business and scientific problems is one of the driving forces behind the "big data" evolution. Organizations are increasingly realizing that the key to better decision making is not better programs but granular crowd-sourced data sets.
Collective intelligence is merely one of the techniques used to endow computer systems with more apparent intelligence and to better solve real world problems - it's not in any way a replacement for the human brain. However, in an increasingly wide range of applications, collective intelligence is clearly outsmarting traditional artificial intelligence approaches.

Sunday, 13 March 2011

Artificial intelligence has just got smarter

Rajeev Srinivasan
The American TV quiz show Jeopardy! has been running for over 25 years. Contestants are given clues in categories ranging from serious subjects such as World War II to more frivolous topics like rock musicians. They then have to come up with a question in the format "Who is…" or "What is…" based on the clues. The clues are not straightforward and factual (a computer with a large database can crack such statements quickly) but oblique. They are full of puns, obscure relationships, jokes, allusions and so on that only a human being steeped in that culture will recognise. In that sense, the clues are not 'context-free' as computer languages are (or, for that matter, classical Paninian Sanskrit): you must know quite a bit of cultural context to decode them.
This is infernally hard for computers, and a challenge that artificial intelligence (AI) researchers have been struggling with for decades: the holy grail of 'natural language processing'. There have been several false starts in AI, and enthusiasm has waxed and waned, but the iconic promise of computers that can converse (such as the talking computer HAL in 2001: A Space Odyssey) has remained elusive. This is why it is exciting news that a new IBM program (dubbed 'Watson' after the founder of the company), built specifically to play Jeopardy!, defeated two of the world's best human players in a special edition of the show on February 16th. There was some quiet satisfaction among the techie crowd that the day may yet arrive when intelligent robots can respond to conversational queries.
Watson runs on a cluster of ninety Linux-based IBM servers, and has the horsepower to process 500 gigabytes of data (the equivalent of a million books) per second. That speed is necessary to arrive at an answer in no more than three seconds, the time human champions need to press the buzzer that gives them the right to answer the question. Ray Kurzweil, an AI pioneer and futurist, suggests this level of computing power will be available in a desktop PC in about a decade.
Watson's accomplishments are qualitatively different from those of its predecessor, Deep Blue, which defeated world chess champion Garry Kasparov in 1997. In many ways chess, with its precise rules, is much easier for computers than the loose and unstructured Jeopardy! game. Thus, Watson is much more complex than Deep Blue, which stored the standard chess openings and did a brute-force analysis of every possible outcome a few moves into the future.
The interesting question, though, is: what does all this mean for humans? The nightmare possibility is that we have reached that tipping point where humans will become redundant. That, of course, was the precise problem that 2001: A Space Odyssey's HAL had: it felt the humans on board its spaceship were likely to cause the mission to fail, and therefore it methodically set about eliminating them. Much the same dystopic vision haunts us in other science-fiction films: for instance, the omniscient Skynet in The Terminator series or the maya-sustaining machines in The Matrix.
Berkeley philosopher John Searle, writing in the Wall Street Journal, gives us some comfort. According to him, Watson is merely a symbol-manipulating engine; it does not have superior intelligence, nor is it 'thinking'. It merely crunches symbols, i.e. syntax, with no concept of meaning, i.e. semantics. "Symbols are not meanings," he concludes. "Watson did not understand the questions, or its answers… nor that it won — because it doesn't understand anything."
Even without becoming our overlords, Watson and its descendants may cause displacement. They will cause a number of jobs to disappear, just as voice recognition is affecting the transcription industry. Former hedge-fund manager Andy Kessler suggests in the WSJ that there are several types of workers, but basically 'creators' and 'servers'; only the former are safe. Technology such as Watson will, he says, not only disrupt retail workers (e.g. travel agents), bureaucrats, stockbrokers and customer support staff, but also legal and medical professionals. The latter may find applications like a doctor's or lawyer's assistant increasingly cutting into their job content.
Thus the arrival of Watson-like artificial intelligences may cause serious disruption in the workforce, although it is not likely that they will be ordering us around any day soon. At least not yet. Humanity may be more resilient than we thought.

Tuesday, 8 March 2011


Artificial intelligence has taken a big leap forward: two roboticists (Lipson and Zagal), working at the University of Chile, Santiago, have created what they claim is the first robot to possess “metacognition” — a form of self-awareness which involves the ability to observe one's own thought processes and thus alter one's behavior accordingly.
The starfish-like robot (which has but four legs) accomplished this mind-like feat by first possessing two brains, similar to how humans possess two brain hemispheres (left and right). This provided the key to the automaton’s adaptability within a dynamic, and unpredictable, environment.
The double bot brain was engineered such that one ‘controller’ (i.e., one brain) was “rewarded” for pursuing blue dots of light moving in random circular patterns, and avoiding running into moving red dots. The second brain, meanwhile, modeled how well the first brain did in achieving its goal.
But then, to determine if the bot had adaptive self-awareness, the researchers reversed the rules (red dots pursued, blue dots avoided) of the first brain’s mission. The second brain was able to adapt to this change by filtering sensory data to make red dots seem blue and blue dots seem red; the robot, in effect, reflected on its own “thoughts” about the world and modified its behavior (in the second brain), fairly rapidly, to reflect the new reality.
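A highly simplified sketch of that two-brain loop appears below: a fixed controller chases "blue," while a second process watches the reward signal and flips a color-swapping sensory filter when rewards collapse. Every name and number here is an assumption for illustration; the actual robot's architecture is far more involved.

```python
# Toy version of the two-brain arrangement described above. Brain 1 is a
# fixed controller; "brain 2" is the swap_filter logic that inverts
# perception when the reward signal turns negative.

def brain1_action(color):
    return "pursue" if color == "blue" else "avoid"

def reward(color, action, rules_reversed):
    target = "red" if rules_reversed else "blue"
    return 1 if (color == target) == (action == "pursue") else -1

swap_filter = False  # brain 2's model of whether perception needs inverting

episodes = [("blue", False), ("red", False),   # original rules
            ("blue", True), ("red", True),     # rules reversed mid-run
            ("blue", True), ("red", True)]

for step, (color, rules_reversed) in enumerate(episodes):
    perceived = {"blue": "red", "red": "blue"}[color] if swap_filter else color
    action = brain1_action(perceived)
    r = reward(color, action, rules_reversed)
    print(f"step {step}: saw {color}, acted {action}, reward {r}")
    if r < 0:  # brain 2 notices failure and flips its sensory filter
        swap_filter = not swap_filter
```

After a single failure at the rule change, the filter flips and rewards recover, which is the flavor of rapid self-correction the researchers report.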
This achievement represents a significant advancement over earlier successes with AI machines in which a robot was able to model its own body plan and movements in its computer brain, make “guesses” as to which of its randomly selected body-plan models was responsible for the correct behavior (movement), and then eliminate all the unsuccessful models, thus exhibiting an “analogue” form of natural selection (see Bongard, Zykov, Lipson, 2006).

TOPIO, a humanoid robot, played ping pong at Tokyo International Robot Exhibition (IREX) 2009.
The team is already moving beyond this apparent meta-cognition stage and is attempting to enable a robot to develop what’s known as a ‘theory of mind’ – the ability to “know” and predict what another person (or robot) is thinking. In an early experiment, the team had one robot observe another robot moving in a semi-erratic manner (in a spiral pattern) in the direction of a light source. After a short while, the observer bot was able to predict the other’s movement so well that it was able to “lay a trap” for it.
Lipson believes this to be a form of “mind reading”. However, a critic might argue that this is more movement-reading than mind-reading, and that it remains to be proven that the observer bot has any understanding of the other’s “mind”. A behavior (such as the second bot trapping the first) might simulate some form of awareness of another’s thought process, but can we say for sure that this is what is really happening?
One idea that might lend credence to this claim is if the observer bot had a language capacity that allowed it to express its awareness, or ‘theory of mind’. Nearly two decades ago, pioneering cognitive biologists Maturana and Varela posited that “Language is the sine qua non of that experience called mind.”
And achieving such a “languaging” capacity is not out of the question; a few years ago, a team of European roboticists created a community of robots that not only learned language, but soon learned to invent new words and to share these new words with the other robots in the community (see: Luc Steels, of the University of Brussels/SONY Computer Science Laboratory in Paris).
It is conceivable that a similarly equipped robot — also possessing the two-brain structure of Lipson’s robots — could observe itself thinking about thinking, and express this awareness through its own (meta) language. Hopefully, we will be able to understand what it is trying to express when and if it does.

A Pick and Place robot in a factory. You've come a long way droidy.
In a recent SciAm article on this topic, Lipson stated:
“Our holy grail is to give machines the same kind of self-awareness capabilities that humans have”
One other question that remains, then: Will the robot develop a more complex simulation/awareness of itself, and the world, as it learns and interacts with the world, as we do?
The four-legged robot also exhibited another curious behavior: when one of its legs was removed (so that it had to relearn to walk), it seemed to show signs of what is known as phantom limb syndrome, the sensation that one still has a limb though it is in fact missing (this is common in people who have lost limbs in war or accidents). In humans, this syndrome represents a form of mental aberration or neurosis (perhaps even a hallucination). A robot acting in this way — holding a false notion of itself — may give scientists and AI engineers a glimpse into robot mental illness.
A robot with a mental illness or neurosis? Yes, this seems entirely likely given the following three theorems:
1] Neurosis is accompanied by (and is perhaps a function of) acute self-awareness; the more self-aware, the more potentially neurotic one becomes.
2] Robots with advanced heuristics (enabled by multiple brains, self-simulators and sensor inputs) will inevitably develop advanced self-awareness, and thus the greater potential for 1] above.
3] There is an ancient, magickal maxim: Like begets like. The creator is in the created (in Biblical terms: “God made man in his own image”).

What would Freud say about this form of attachment?
Mayhaps the ‘Age of Spiritual Machines‘ could become an ‘Age of Neurotic Machines‘ (or Psychotic Machines, depending on your view of humans), too. So then, if this be the fate of I, Robot, let’s do our droid friends a favor and engineer a robo-shrink, or, at least, a good self-help program…and a love for Beethoven.

Monday, 7 March 2011

Managing Free Text Archives with Linguistic Semantics

Semantic natural language processing interprets the meaning of free text and enables users to find, mine and organize very large archives quickly and effectively. Linguistic semantic processing finds all and only the desired information because it determines meaning in context and maps synonym and hyponym relationships. It avoids assigning incorrect relationships because meaning is precisely determined. At the same time, it makes all the desired connections exhaustively because it is backed by a massive lexicon and semantic map. The key to the scalability of Cognition's linguistic semantic processing is bottom-up interpretation of the text, finding the meaning of words and phrases in the local context one at a time. The technology has a semantic map and algorithms that interpret language linguistically rather than statistically, so that the meaning of a given document is independently determined. As a result, the methods scale to a theoretically unlimited number of documents. Linguistic semantic NLP is being deployed in many applications that facilitate rapid and accurate management of very large archives:
1. Free auto-categorization - Texts are categorized into an existing ontology or a special client-defined ontology according to the salient concepts in them.
2. Segregation by genre - The software determines which of a predetermined set of genres a document falls into. In the legal domain, the genre set might be "contracts" and, within "contracts", "employment contract", "services contract", etc., PPM, "pricing proposal", "mortgage agreement", and so on.
3. Conceptual foldering (or tagging) - Documents are placed in conceptual folders using conceptual Boolean expressions that cover all of the topics desired for the folders. This is especially useful in e-Discovery, where documents can be culled, leaving only the relevant portion to be reviewed. (A minimal code sketch of this foldering appears after this list.)
4. Intelligent search - The semantic search function retrieves almost all and only the desired documents. Very high precision is achieved by disambiguating words in context, and by phrasal reasoning. Very high recall is achieved by paraphrase and ontological reasoning.
5. Text Analytics - Calculating the frequency and salience of words, word senses, concepts and phrases in a document or document base lays bare its significant semantic content.
6. Sentiment analysis - With semantic processing, sentiments can be determined. Existing lexical resources identify the "pejorative" and "negative" words.
7. Language monitoring - In some situations such as child chat or email, certain types of language may need to be blocked. Linguistic semantic processing detects undesirable (or desirable) language as defined by administrators.
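Here is the minimal foldering sketch promised in item 3: each folder is a Boolean expression over concepts detected in a document. The keyword-based concept extraction and folder definitions below are crude stand-ins for Cognition's actual semantic analysis.

```python
# Toy conceptual foldering: folders are Boolean expressions over the set of
# concepts found in a document. All definitions are invented for illustration.

FOLDERS = {
    "employment contracts":
        lambda c: "contract" in c and "employment" in c,
    "mortgage agreements":
        lambda c: "mortgage" in c and ("agreement" in c or "contract" in c),
}

def concepts_of(text):
    # Crude stand-in for real semantic concept extraction.
    return set(text.lower().split())

def folders_for(text):
    c = concepts_of(text)
    return [name for name, expr in FOLDERS.items() if expr(c)]

print(folders_for("employment contract between ACME and J. Smith"))
print(folders_for("mortgage agreement for 12 Elm St"))
```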

Kathleen Dahlgren has a Ph.D. in Linguistics and a Post-Doc in Computer Science from UCLA. She has worked and contributed publications in computational linguistics for over 20 years. Her publications cover topics in sense disambiguation, question-answering, relevance, coherence and anaphora resolution. Her book, Naive Semantics for Natural Language Understanding primarily treats a method for representing commonsense knowledge and lexical knowledge, and how this can be used in sense disambiguation and discourse reasoning. The software offered at Cognition Technologies is patented by Kathleen Dahlgren and Edward P. Stabler, Jr., and has been under development for a number of years, so that it now has a wide coverage semantic map of English.

Sunday, 6 March 2011

Reflections on Watson the Computer


By Sally Blount / Kellogg School of Management

The gap between human and artificial intelligence seems to be getting smaller... on Feb. 16, IBM's “Watson” computer outsmarted two Jeopardy champions.
A recent edition of TIME magazine explored our quest for human perfection and the rapidly emerging human-technology interface. And the current issue of Atlantic magazine reports the ever-closer results of the Turing Test—which determines whether a human or computer program can hold the most human-like conversation for five minutes.

As I read about these technological advancements, I can't help thinking that, if given a chance, I would love to have a chip planted in my brain that would help me remember names. I meet so many people every day from across our 60,000-person community of students, administrators, faculty, alumni and corporate partners. I would feel so much better and be more effective if, with a little help from technology, I could remember everybody's names every time I saw them.
But then I begin to wonder: With that chip implanted, would I become progressively worse at naturally remembering names? I'm not sure I like that idea... and then I can't help but think, what is being human about, anyway? Is it really about each of us trying to become more perfect, each in our own way, or is there some broader, less individually-focused aim?
Once we create computers and performance-enhanced humans that can outperform real humans (by 2045, as TIME predicts), will we have found jobs and eradicated poverty for the billion-plus among us who live on less than $2 a day? Will we have the infrastructure in place to provide every human on the planet with access to clean water and a warm bed? Will we have found deterrents to dramatically reduce, if not halt, the black market for sex trafficking? If the answer to these questions is “yes,” then these technological advancements will be of true value to humanity. But I have a terrible feeling that in 2045 the answers will still be a resounding “no.”
That's because there are some human limitations that technology is far from being equipped to fix. It can't overcome limitations that we ourselves don't know how to solve. One of our most glaring challenges is our collective inability to build effective organizations, organizations that consistently and reliably perform in a way that exemplifies the best of human performance and values. Each day's news reinforces this truth, in the Middle East, Washington, Mexico, and in corporate, government and religious headquarters around the world, as startling and saddening revelations emerge about flawed and corrupt organizations.
If we really want to change the world, we need to put more resources into studying and enhancing our shared human capabilities at building organizations, be they firms, government agencies or NGOs. There are many pressing questions: What are the barriers that deter us? Can we develop and use technology in ways that can counter these barriers? What political and social infrastructure do we need to support organization building? What individual-level skills are needed to equip organization-builders and change agents in established bureaucracies? How does leadership rhetoric help us on this road?
Until we become as good at building and sustaining effective organizations as we are at computer programming, we will never realize our full human potential.

Semantic Technologies Bear Fruit In Spite of Development Challenges

In a conversation with BioInform, Ted Slater, head of knowledge management services at Merck and the CSHALS conference chair, described this year's meeting as "the strongest program" in the four years of its existence.
"Four years ago ... nobody [really] knew about [semantics]," Slater said. "Now we are at the point where we're talking about ... expanding the scope a little bit [and asking,] 'What else can we add into the mix to make it a more complete picture?'"
This year's conference began with a series of hands-on tutorials coordinated by Joanne Luciano, a research associate professor at Rensselaer Polytechnic Institute, that were intended to show how the technology can be used to address drug development needs.
During the tutorials, participants used semantic web tools to create mashups using data from the Linked Open Data cloud and semantic data that they created from raw datasets. Participants were shown how to load data into the subject-predicate-object data structure dubbed the "triple store;" query it using the semantic query language SPARQL; use inference to expand experimental knowledge; and build dynamic visualizations from their results.
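For a small end-to-end taste of that workflow, the sketch below uses the open-source rdflib package to load subject-predicate-object triples into a store and query them with SPARQL. The drug/target data is invented for illustration.

```python
# Load toy triples into an rdflib graph and query them with SPARQL,
# mirroring the tutorial steps described above. Data is invented.

from rdflib import Graph, Namespace

EX = Namespace("http://example.org/")
g = Graph()
g.bind("ex", EX)

# Three toy subject-predicate-object triples in the store.
g.add((EX.aspirin, EX.inhibits, EX.COX1))
g.add((EX.aspirin, EX.inhibits, EX.COX2))
g.add((EX.celecoxib, EX.inhibits, EX.COX2))

# SPARQL: which drugs inhibit COX2?
query = """
PREFIX ex: <http://example.org/>
SELECT ?drug WHERE { ?drug ex:inhibits ex:COX2 . }
"""
for row in g.query(query):
    print(row.drug)  # prints the URIs for aspirin and celecoxib
```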
Luciano told BioInform that this was the first year that CSHALS offered practical tutorials and the response from participants was mostly positive. Furthermore, the tutorials were made available for users in the RDF format so that “we were in real time, during the tutorial, able to run parallel tracks to meet all the needs of the tutorial participants,” she said.
While it's clear to proponents that semantic technology adds value to data, several speakers at the conference indicated that there is room for improvement and that much of the community remains unaware of the advantages that the semantic web offers.
For example, Lawrence Hunter, director of the computational bioscience program and the Center for Computational Pharmacology at the University of Colorado, pointed out that the field is still lacking good approaches to enable "reasoning" or, in other words, to figure out how "formal representations of data can get us places that simple search and retrieval wouldn’t have gotten us."
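To make Hunter's point concrete, consider a toy example (again rdflib, again invented data): no stored triple says that BRCA1 is a gene, so simple retrieval misses the fact, but a single application of the RDFS subclass rule derives it.

```python
# Toy example of "reasoning": a fact never stored explicitly becomes
# retrievable after applying one RDFS rule. Data is invented.
from rdflib import Graph, Namespace
from rdflib.namespace import RDF, RDFS

EX = Namespace("http://example.org/bio/")  # hypothetical namespace
g = Graph()
g.add((EX.BRCA1, RDF.type, EX.TumorSuppressorGene))
g.add((EX.TumorSuppressorGene, RDFS.subClassOf, EX.Gene))

# No stored triple says "BRCA1 is a Gene", so plain lookup fails.
print((EX.BRCA1, RDF.type, EX.Gene) in g)  # False

# RDFS subclass rule: if ?x a ?c and ?c subClassOf ?d, then ?x a ?d.
inferred = [
    (instance, RDF.type, superclass)
    for instance, cls in g.subject_objects(RDF.type)
    for superclass in g.objects(cls, RDFS.subClassOf)
]
for triple in inferred:
    g.add(triple)

print((EX.BRCA1, RDF.type, EX.Gene) in g)  # True
```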
During his presentation, John Madden, an associate professor of pathology at Duke University, highlighted several factors that need to be considered in efforts to "render" information contained in medical documents, such as laboratory reports, physicians' progress notes, and admission summaries, in the RDF format.
A major challenge for these efforts, he said, is that these documents contain a lot of "non-explicit information" that’s difficult to capture in RDF such as background medical domain knowledge; the purpose of the medical document and the intent of the author; "hedges and uncertainty"; and anaphoric references, which he defined as "candidate triples where it's unclear what the subject is."
Yet despite its complexities, many researchers are finding useful applications for the technology. For example, Christopher Baker of the University of New Brunswick described a prototype of a semantic framework for automated classification and annotation of lipids.
The framework comprises an ontology developed in OWL-DL that uses structural features of small molecules to describe lipid classes, and two federated semantic web services deployed within the SADI framework: one that identifies relevant chemical "subgraphs" and a second that "assigns chemical entities to appropriate ontology classes."
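Baker's framework itself relies on OWL-DL definitions and SADI services, but the core idea, assigning an entity to a class from its structural features, can be sketched in a few lines. The class names and feature thresholds below are invented and are not taken from the lipid ontology.

```python
# Toy classifier (not Baker's SADI framework): assign a chemical entity
# to a lipid class from its structural features, the way an OWL-DL
# reasoner would from logical class definitions. Names/thresholds invented.
from dataclasses import dataclass

@dataclass
class Molecule:
    name: str
    chain_length: int         # carbons in the main chain
    has_carboxyl_group: bool
    double_bonds: int

def classify(mol: Molecule) -> str:
    """Map structural features to a hypothetical lipid class name."""
    if mol.has_carboxyl_group and mol.chain_length >= 4:
        if mol.double_bonds == 0:
            return "SaturatedFattyAcid"
        return "UnsaturatedFattyAcid"
    return "UnclassifiedLipid"

print(classify(Molecule("palmitic acid", 16, True, 0)))  # SaturatedFattyAcid
print(classify(Molecule("oleic acid", 18, True, 1)))     # UnsaturatedFattyAcid
```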
Other talks from academic research groups described an open source software package based on Drupal that can be used to build semantic repositories of genomics experiments and a semantics-enabled framework that would keep doctors abreast of new research developments.
Creating Uniformity
Semantic technologies are also finding their way into industry. Sherri Matis-Mitchell, principal informatics scientist at AstraZeneca, described the first version of the firm’s knowledgebase, called PharmaConnect, which was released last October and integrates internal and external data to provide connections between targets, pathways, compounds, and diseases.
Matis-Mitchell explained that the tool allows users to conduct queries across multiple information sources "using unified concepts and vocabularies." She said that the idea behind adopting semantic technologies at AstraZeneca was to shorten the drug discovery timeframe by bringing in "knowledge to support decision-making" earlier on in the development process.
The knowledgebase is built on a system called Cortex and receives data from four workstreams. The first is chemistry intelligence, which supports specific business questions and can be used to create queries for compound names and structures. The second is competitive intelligence, which provides information about competing firms' drug-development efforts, while the final two streams are disease intelligence, used to assess drug targets; and drug safety intelligence.
In a separate presentation, Therese Vachon, head of the text mining services group at the Novartis Institutes for Biomedical Research, described the process of developing a federated layer to connect information stored in multiple data silos based on "controlled terminologies" that provide "uniform wording within and across data repositories."
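As a rough sketch of what a controlled terminology buys you (the terms and silos below are invented, not Novartis data): once free-text variants are normalized to a preferred term, a single query can match records across every repository.

```python
# Sketch of a controlled terminology: free-text variants from separate
# data silos are normalized to one preferred term, so a single query
# matches records in every repository. All terms are invented examples.
PREFERRED_TERM = {
    "heart attack": "myocardial infarction",
    "mi": "myocardial infarction",
    "myocardial infarction": "myocardial infarction",
}

def normalize(term: str) -> str:
    # Fall back to the raw (lowercased) term when there is no mapping.
    key = term.strip().lower()
    return PREFERRED_TERM.get(key, key)

silo_a = ["Heart attack", "stroke"]    # e.g., a clinical notes silo
silo_b = ["MI", "migraine"]            # e.g., a safety database silo

# After normalization, both silos report the same condition uniformly.
print({normalize(t) for t in silo_a} & {normalize(t) for t in silo_b})
# -> {'myocardial infarction'}
```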
Is the Tide Turning?
At last year's CSHALS, there was some suggestion that pharma's adoption of semantic methods was facing the roadblocks of tightening budgets, workforce cuts, and skepticism about the return on investment for these technologies (BI 03/05/2010).
Matis-Mitchell noted in an email to BioInform that generally new technologies take time to become widely accepted and that knowledge engineering and semantic technologies are no different.
She said her team overcomes this reluctance by regularly publishing its "successes to engender greater adoption of the tools and methods." While she could not provide additional details about these successes in the case of PharmaConnect for proprietary reasons, she noted that the "main theme" is that it "helped to save time and resources and supported more efficient decision making."
However, some vendors now feel that drug developers may be willing to give semantic tools a shot, and they are gearing up to provide products that support the technology.
In one presentation, Dexter Pratt, vice president of innovation and knowledge at Selventa, presented the company's Biological Expression Language, or BEL, a knowledge representation language that represents scientific findings as causal relationships that can be annotated with information about biological context, experimental methods, literature sources, and the curation process.
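Without reproducing BEL's actual syntax, the sketch below illustrates the general idea as the talk described it: a scientific finding modeled as a causal subject-relation-object statement carrying annotations for biological context and literature source. Every name and value here is invented.

```python
# Illustration only, not actual BEL syntax: a finding modeled as a causal
# subject-relation-object statement with the kinds of annotations the
# article lists. Every name and value below is invented.
from dataclasses import dataclass, field

@dataclass
class CausalStatement:
    subject: str                 # e.g., a protein abundance
    relation: str                # "increases" or "decreases"
    obj: str
    context: dict = field(default_factory=dict)  # biological context
    citation: str = ""                           # literature source

stmt = CausalStatement(
    subject="protein(TP53)",
    relation="increases",
    obj="rna(CDKN1A)",
    context={"species": "human", "tissue": "liver"},
    citation="PMID:0000000 (placeholder)",
)
print(f"{stmt.subject} {stmt.relation} {stmt.obj} {stmt.context}")
```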
Pratt said that Selventa plans to release BEL as an open source language in the third quarter of this year and that it will be the firm's first offering for the community.
Following his presentation, Pratt told BioInform that offering the tool under an open source license is "consistent" with Selventa's revised strategy, announced last December, when it changed its name from Genstruct and decided to emphasize its role as a data analysis partner for drug developers (BI 12/03/2010).
To help achieve this vision, Selventa "will make the BEL Framework available to the community to promote the publishing of biological knowledge in a form that is use-neutral, open, and computable," Pratt said, adding that the company's pharma partners have been "extremely supportive" of the move.
Although the language has already been implemented in the Genstruct Technology Platform for eight years, in preparation for its official release in the open source space, Selventa's developers are working on a "new build" of the legacy infrastructure that is "formalized, revised, and streamlined."

Friday, 4 February 2011

New Life for Semantic Technologies

Cambridge Semantics provides flexible solutions for the data deluge.

By Kevin Davies
February 4, 2011 | A small software company formed by a group of former IBM staffers is breathing new life into semantic technologies. But don’t look for Cambridge Semantics to harp on the term.
“The world of people well versed in semantic technology is still quite small,” says co-founder Lee Feigenbaum. “It’s important that anyone working with our software should not be IT. You won’t see the word ‘semantics’ anywhere in our software. It’s an enabler for us. We can’t build our software without these technologies, but now that we’ve built them, we’ve no interest in preaching that you’re using semantics.” (see “Masters of the Semantic Web,” Bio•IT World, Oct 2005)
“We don’t lead with ‘Semantic Web’ as a marketing term,” adds senior product manager Rob Gonzalez. “We’d like to see more companies like us trying to solve real-world problems. For us it’s about the problems we’re solving.”
Along with CTO Sean Martin, Feigenbaum was one of a group of about 20 people in an advanced technology group at IBM dating back to 1995. The group’s mission was to research new Internet technologies (including semantic technologies) and potential applications for IBM. An early client was a group of cancer researchers at the Massachusetts General Hospital (the Center for the Development of a Virtual Tumor), for which the IBM team helped to deploy semantic technologies for building and sharing models, data, and literature.
In 2007, Martin and Feigenbaum, together with Simon Martin and Emmett Eldred, established Cambridge Semantics and spent a couple of years building up the engineering team and testing early products before launching the company's first commercial product in late 2009. Luckily, much of the IBM group’s technology was open source. “People have been [saying] that they can’t build libraries or services that are really reusable or discoverable. We think with semantics, you get these benefits,” says Feigenbaum.
Early customers include Johnson & Johnson, Merck, and Biogen Idec, although Cambridge Semantics’ client base also includes Fortune 500 companies in advertising and the oil industry. “This technology can be used in many industries, but is particularly geared toward life sciences,” says Gonzalez. “The data bonanza isn’t comparable to other industries. Life scientists simply need this flexibility.”
Semantic Sidestep
There’s a saying that Feigenbaum admits is neither new nor particularly funny, but it makes a point: If you put ten Semantic Web advocates in a room, you’ll get 15 different explanations of what the Semantic Web is. “You have a loosely coupled set of technologies that people can use for a million different things. People will latch onto something and say this is the real semantic technology.”
Indeed, Feigenbaum is blunt in his criticism of vendors and users alike who proclaim the magical properties of the Semantic Web. “I’ve seen pharma talk about semantics as the ultimate data integration/analysis tool. That’s all well and good and we might get there in the next 10-15 years, but it’s never been what we’ve seen in semantics.”
For Feigenbaum, the interesting bit of semantic technology is the notion of rebranding data in a flexible and agile way. “The underlying properties of semantic technologies let you build very agile, adaptive software systems as data sources change. It happens in all industries but especially in life sciences.”
Semantics is about flexibility and having a common data model upon which one can take information from a variety of sources—XML, relational databases, or public clinical trial databases—and “map them to a common format not constrained by any a priori database schema or XML structure. We saw this flexibility in 2001, and proved it out at IBM. That’s what we wanted to leverage.”
Cambridge Semantics released its first three products in 2009. “There’s no magic to the software,” says Feigenbaum. Just an easy-to-use interface and set of tools that allows users to point to a particular area in a spreadsheet, for example, and ascribe a meaning, e.g. adverse event, assay result. “You have these common vocabularies and data models, and the system takes care of finding values that match and links them together, without having necessarily considered that way of linking things when you set up the system.”
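The pattern Feigenbaum describes can be sketched as follows (this is not Anzo's implementation; the vocabulary, namespaces, and rows are invented): annotate each spreadsheet column with a term from a shared vocabulary, then emit the cells as triples that can link against data from other sources.

```python
# Sketch (not Anzo itself): annotate spreadsheet columns with shared
# vocabulary terms, then emit each cell as an RDF triple.
# Requires: pip install rdflib. Vocabulary and rows are invented.
from rdflib import Graph, Literal, Namespace

VOCAB = Namespace("http://example.org/vocab/")  # hypothetical shared vocabulary
DATA = Namespace("http://example.org/data/")

# The "ascribed meaning" for each spreadsheet column.
column_meaning = {"AE": VOCAB.adverseEvent, "Result": VOCAB.assayResult}

rows = [
    {"id": "subj-001", "AE": "headache", "Result": "positive"},
    {"id": "subj-002", "AE": "nausea", "Result": "negative"},
]

g = Graph()
for row in rows:
    subject = DATA[row["id"]]
    for column, meaning in column_meaning.items():
        g.add((subject, meaning, Literal(row[column])))

# The resulting graph can now be linked against other RDF sources.
print(g.serialize(format="turtle"))
```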
The Anzo Data Collaboration Server, which sits on the user’s server, is semantic middleware, the plumbing that runs and connects everything else. Says Feigenbaum: “It invokes Web services. It has data services and server services that let you build flexible applications.”
Anzo on the Web is a Web 2.0-style application for self-service reporting of any data connected to the data collaboration server. Typically, when users want to use a new data source, Gonzalez explains, they have to change the database, then the application code, then the web tier. “With Anzo on the Web, you can bring the new data source easily into the data collaboration server, and it propagates throughout the system without requiring a lot of manual changes, so it’s resilient to new types of information being added.” The application is designed for scientists who aren’t necessarily IT experts. “They don’t have to go to IT to build new views; they can do it,” says Feigenbaum.
Anzo for Excel is a plug-in to Microsoft Excel that lets people use spreadsheets more effectively. It makes the collection of ad hoc data trivial, says Feigenbaum. “It turns Excel into a data collection application and lets it serve as a user interface for all this data integrated on the server. Now you can consume the data.” A recently released second version adds an unnamed component that allows users to collect and integrate data from relational databases.
The company announced in mid-January an agreement with Cray to collectively develop and market solutions, including the Cray XMT system and the Anzo product suite. But Feigenbaum is also using the Amazon Cloud, particularly with new prospects. “The data integration paradigm we’re preaching is anathema to a lot of traditional IT,” says Feigenbaum, particularly in regard to procuring hardware, which can sometimes take months. “Many customers run a proof-of-concept in the Cloud with hosted versions of the software. That lets them prove out the technology and work on the procurement to deploy inside their firewall.”
One of the chief benefits of Cambridge Semantics, says Feigenbaum, is that it affords pharma customers the ability not only to pull in and analyze the data from a traditional database but also “the last 10-15% of their data that might be lurking in a desktop spreadsheet or a public resource such as NCBI. They don’t want to spend millions of dollars and 18 months only to get 90% of the way. They need to handle the heterogeneity of Excel and public data. [The missing data] might only be a small part of the total information but it’s a deal breaker.”
Early users span applications from manufacturing quality control to budgeting, allowing customers such as Biogen Idec to compare their actual spend with budget projections. Merck is using Cambridge Semantics applications to procure time on lab equipment.
Cambridge Semantics is still learning from its early customers where its technology can be leveraged. One promising area is in clinical trial data management. Says Feigenbaum: “When you’ve brought together data that don’t normally talk to each other, there’s a bunch of things you can do, such as looking at data for a drug across trials/phases. But some [historical] trials might have used SAS or Oracle Clinical. This is a good way to bring data together,” perhaps to identify reporting discrepancies for regulatory purposes.
An alternative term for semantic technologies that is growing in popularity is “linked data.” “It’s fine,” shrugs Feigenbaum. “It’s just another name. It’s had some success in life sciences, but I don’t care what it’s called.”

Thursday, 3 February 2011

An Immortal Lesson in Design

The Mac Inventor's Deathbed Gift: An Immortal Lesson in Design For His Son

The man who created the Mac interface gives his son Aza Raskin a final gift that testifies to the beauty and power of simplicity.


Twenty-five days before my father Jef died, on my birthday exactly six years ago, he gave me a present. He had the sparkle back in his eye -- the one that had been reduced by pancreatic cancer to an ashen ember -- when he gave it to me. It was a small package, rectangular in shape, in crisp brown-paper wrapping. Twine neatly wrapped around the corners, crisscrossing back and forth, arriving at a bow crafted by the sure hands of a man who built his first model airplane at age seven.
This small brown package was to be the final gift my father ever gave me.
My family does gifts strangely. For instance, we have our own mangled interpretation of Hanukkah, where each person in the family has a night to give out presents. If we have five people home for Hanukkah, we celebrate only five of the eight nights. The joy of gifts is in the giving, not the receiving, so before opening your present you must first guess what’s inside. This tradition is "plenty questions," a more forgiving version of the standard twenty questions.
“Animal, vegetable, or mineral?” I ask.
I stare at the package. In it is my father. The man who invented the Mac.
We are in it for the game of teasing the gift out of the gifter. It's like extracting a ball of yarn from a kitten. The tugs, pulls, and misdirections are the fun. The question must be answerable by a simple "yes" or "no." Naturally, the later into the questions we get, the more liberal this rule becomes. We don't break the rule exactly, but answers become a series of "not-exactly"s and "yes-but"s. In past years, the givers have often spent hours creating elaborate disguises for the gifts. I've shaped styrofoam into a fantastic reptilian shape to disguise a pair of earrings for my mother. She guessed them perfectly anyway. There may be collusion going on.
"Mineral," my father says.
We often waste questions on silly asides. We ask about refrigerators and ostrich eggs when the gift is clearly book shaped. But my father is sick. Where there was once the thought that a cure might be found, only fleeting misplaced hope remains like a high school summer fling dissipating in the face of college. We know there isn't much time. Still I ask.
I stare at the package in my hands. In it is my father. The man who invented the Macintosh and misnamed what should be "typefaces" as the "fonts" menu. He never forgave himself for his incorrect usage of English. He groomed me to use language exactingly and considered that mistake a failure of being young and reckless with semantics. The man who invented click-and-drag was now the man who could hardly keep his gaze focused on his son. The box is, of course, smaller than a bread box. It's a question we always ask. My family smiles only out of habit.
"No," my father says. A long pause. "No," he says again, "it is smaller than a bread box. Smaller and sharper." He speeds the guessing game along. Time.
The gift was a message about an entire way of thought.
"Sharper?" I ask. A knife? The box is too small for a typical kitchen knife. It could be a Swiss Army knife. Jef always carries one. The big blade is for food, the little blade for everything else. He gets a bit indignant if you borrow it and use the wrong blade. I have a Swiss Army knife, but I haven't carried it since airport security theater ramped up after 9/11. It probably isn't a knife. Maybe a razor? One can't just ask outright, that doesn't give enough information when you are wrong. Something sharp could be many things. Seeking something more strategic I ask, "Can it be found in a bathroom?"
Long pause.
"Yes."
Three days before he passed, Jef had an accident. He needed to use the restroom, so -- stooped under his arm -- I supported his weight as he hobbled to his business. There was something quietly unsettling about escorting my father to a toilet that had been taller than me when we first moved into the house twenty years earlier. I sat him down, walked out, and closed the door. Moments later, a crash jolted the house. I slammed the door open. The metallic smell of water fresh from a pipe whipped my nose, and water flooded the floor. The toilet was dislocated from its base like an arm from its socket, and lodged between the toilet and the wall was my father. Despite his size, he looked small and meager. He stared up at me with eyes full of innocent surprise. Why am I on the floor, they asked. Why am I wet? The shocked curiosity in his wide-open eyes is the single most haunting image I have of my father. In the dark space between closing my eyes and falling asleep, that image sometimes steals in and taunts me. When it does, there is no help for it. I have to get out of bed and go for a run. Otherwise, sleep will be overshadowed by those confused, guileless eyes.
"It must be a razor?" I ask. He nods his assent with a satisfied smile. He gestures for me to open it. Carefully undoing the knot, the twine, and the paper reveals a cardboard box on which he has written "For Pogonotomy." Of course there is a word for beard trimming, and of course my father knows and uses it. In high school, I played a trick on my teachers: in every essay I used my own made-up word. I used "indelic" to mean something between "endemic" and "inextricably entwined." No matter how many times I trotted it out, not one of my teachers caught me. I used it once in passing with my father and he immediately but gently pointed it out as a non-word. Some men spend time meticulously trimming their beard. My father trimmed his vocabulary. Language is communication, and my father was fastidious about it. Often when we got into particularly deep conversations, he'd pause and continue the rest of the discussion in written form where he could distill his thoughts into a sharp crystalline relief.
The razor itself was a vintage safety razor. Looking at it, I understood the allure. It is an inventive and simple design. The razor takes a flat blade and arches it under a metal shield, giving the blade both greater mechanical strength as well as a protective sheath that keeps you safe. It's the kind of clear insight for which all designers and inventors strive: beauty in turning constraints into advantages.
That razor is a message, rendered in steel and wood, about an incorporeal way of thought. That was my father's final gift to me: a way of looking at the world through the lens of playful questioning, which reveals more than just an answer. Twenty-five days later, the razor remained but my father did not.
Jef, I miss you.