Re: KM: Organizing and Finding Content… Out of the box thinking #metadata
Jeff, I like your approach for certain specific applications, but I don’t see it as a general solution for finding information inside the enterprise. The answer to all or most of the issues you and Gavin raise is the smart use of text analytics: auto-categorization feeding aboutness, entity extraction feeding faceted metadata and other applications, and related functionality like sentiment analysis, auto-summarization, etc. For more, see my book, Text Analytics.
As for human vs. auto-categorization, the answer I’ve had the most success with is a hybrid model: auto-tagging combined with human review. The first step is getting very accurate auto-tagging rules. A human review of suggested tags is significantly easier, and leads to more accurate tags, than either fully automatic tagging or purely human classification. I’ve been getting 95% accuracy for auto-tagging, even without human review, with a new approach: a Mini-POC that builds auto-tagging rules in a week.
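For readers who haven’t seen this workflow in code, here is a minimal sketch of the hybrid model. Everything below (the rule set, tags, weights, and threshold) is invented for illustration; real rules would come out of the Mini-POC process, not be hand-typed like this. The key idea is that high-confidence matches are accepted automatically, while weaker matches are routed to a human reviewer:

```python
import re

# Hypothetical rule set: each tag maps to (regex pattern, weight) pairs.
# In practice these rules are developed and tuned during the Mini-POC.
RULES = {
    "safety": [(r"\bhazard(s)?\b", 2.0), (r"\bincident report\b", 3.0)],
    "finance": [(r"\bquarterly revenue\b", 3.0), (r"\bbudget\b", 1.5)],
}

REVIEW_THRESHOLD = 3.0  # scores below this go to a human reviewer

def auto_tag(text):
    """Score each tag's rules against the text.

    Returns (accepted, needs_review): tags above the confidence
    threshold are applied automatically; weaker matches are queued
    for the (much faster) human review of suggested tags.
    """
    accepted, needs_review = [], []
    for tag, patterns in RULES.items():
        score = sum(w for pat, w in patterns if re.search(pat, text, re.I))
        if score >= REVIEW_THRESHOLD:
            accepted.append(tag)
        elif score > 0:
            needs_review.append(tag)
    return accepted, needs_review
```

The design point is the split itself: reviewing a short list of suggested tags is a far smaller cognitive task than classifying a document from scratch, which is why the hybrid beats either pure approach.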
Chief Knowledge Architect
Author: Deep Text
KAPS Group, LLC
From: SIKM@groups.io [mailto:SIKM@groups.io] On Behalf Of Gavin Chait
Sent: Thursday, December 12, 2019 1:42 AM
Subject: Re: [SIKM] KM: Organizing and Finding Content… Out of the box thinking
I’ve been experimenting with this over the years and it’s definitely a serious challenge.
We tried a graph-based metadata search engine at GE Healthcare, but in practice, despite testing well and genuine enthusiasm from users, it turned out to be too slow for the average person. Navigating or searching by traversing nodes in a graph is slow and assumes specialist knowledge that users may not have. For better or worse, the search-box model of querying is what people are familiar with, even if alternatives return better results.
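To make the speed problem concrete, here is a toy sketch of graph-style metadata navigation (an assumed structure, not the GE Healthcare system): concept nodes link to related concepts and to document IDs, and finding a document means hopping node to node. Every hop is a decision the user has to understand and make, which is why this feels slow next to typing one query into a search box:

```python
from collections import deque

# Toy metadata graph (illustrative names only): concept nodes with
# links to related concepts and to the documents filed under them.
GRAPH = {
    "imaging": {"related": ["mri", "ct"], "docs": []},
    "mri": {"related": ["coil-repair"], "docs": ["doc-101"]},
    "ct": {"related": [], "docs": ["doc-202"]},
    "coil-repair": {"related": [], "docs": ["doc-303"]},
}

def traverse(start, max_hops=2):
    """Breadth-first traversal: collect documents within max_hops of start.

    Each hop models a navigation step the user must choose correctly,
    which assumes they already know how the concepts connect.
    """
    seen, docs = {start}, []
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        docs.extend(GRAPH[node]["docs"])
        if depth < max_hops:
            for nxt in GRAPH[node]["related"]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, depth + 1))
    return docs
```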
We sit with specialist knowledge, so the means by which Google, Bing, etc. index content does not work for us. They can rely on reinforcement from millions of page views to surface the most likely result without needing much knowledge of the content itself. Our specialist knowledge is used by specialists: the report a user is looking for may be critical to them, yet may never have been read by anyone else.
We need natural-language “aboutness” algorithms to index our data. Sadly, keyword-itis has taken over report-writing, so classifiers end up over-classifying content.
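One simple illustration of an “aboutness” heuristic is plain TF-IDF term weighting (real text-analytics platforms use much richer models, so treat this as a sketch, not a recommendation): terms that appear in every document, including keywords stuffed into everything, score near zero, while terms distinctive to one document rise to the top.

```python
import math
from collections import Counter

def aboutness(docs):
    """Rank each document's terms by TF-IDF as a rough 'aboutness' proxy.

    A term stuffed into every report gets idf = log(N/N) = 0, so
    keyword-itis is automatically discounted; distinctive terms win.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n = len(tokenized)
    df = Counter()                       # document frequency per term
    for toks in tokenized:
        df.update(set(toks))
    ranked = []
    for toks in tokenized:
        tf = Counter(toks)
        scores = {t: (c / len(toks)) * math.log(n / df[t])
                  for t, c in tf.items()}
        ranked.append(sorted(scores, key=scores.get, reverse=True))
    return ranked

# Example: "report" appears in every document, so it sinks to the
# bottom of each ranking even where it is the most frequent word.
docs = [
    "report report mri scanner calibration",
    "report quarterly finance budget",
    "report staffing report rota",
]
```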
Which leads back to human classification of content … which is not ideal.
It would be interesting to know what other solutions and ideas people have tried.
Gavin Chait is a data engineer and development economist at Whythawk.
I have been working on this problem for over 25 years (Lotus Notes, Usenet). Working at a large company presents all the challenges: I don’t know where to look in constantly expanding network file folders, we can’t agree on global metadata, search returns too many irrelevant results…
The real problem isn’t organizing information; it’s finding relevant information quickly. “I can’t find anything” is the number one complaint, no matter what technology you are using.
I do expert interviews in my consulting practice. One deliverable is a detailed set of notes organized by the common topics the expert used frequently when describing the work. These topics become the metadata. I then work with the expert to create a knowledge or concept map from these terms, organized the way the expert connects them in a mental model.
This model, with the associated expertise, is instantly recognized by colleagues and managers as a valuable learning and performance asset. Recently, I was able to incorporate the metadata model into a clickable page in SharePoint. You can now see all the metadata on one page, organized the way an expert thinks, and find answers in one click. Check out a demo of this at transferknowhow.com.
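In data terms, that kind of one-page concept map is just a small nested mapping from the expert’s themes to topics to linked answers. The themes, topics, and paths below are hypothetical placeholders (not from any real engagement or from the SharePoint demo); the sketch only shows how flat the structure can be while still preserving the expert’s organization:

```python
# Hypothetical concept map built from expert-interview topics:
# theme -> topic -> link to the notes that answer it.
CONCEPT_MAP = {
    "Equipment Setup": {
        "Calibration": "notes/calibration.html",
        "Safety Checks": "notes/safety.html",
    },
    "Troubleshooting": {
        "Common Faults": "notes/faults.html",
    },
}

def render_map(cmap):
    """Flatten the map into 'Theme > Topic -> link' lines.

    Every entry is one click away, and the grouping mirrors the
    expert's mental model rather than a generic taxonomy.
    """
    lines = []
    for theme, topics in cmap.items():
        for topic, link in topics.items():
            lines.append(f"{theme} > {topic} -> {link}")
    return lines
```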