Automatic Text Categorization
The International Patent Classification (IPC) is a hierarchical system of language-independent symbols used to classify patents and utility models according to the area of technology they belong to. This makes the IPC very useful for retrieving patent documents, so categorizing patents accurately within it is of utmost importance.
Artificial Intelligence has made the categorization process easier and far more efficient. It helps optimize search results and assign patents to their respective categories.
What is IPCCAT-neural?
In brief, IPCCAT is a tool for automatic text categorization in the IPC. It classifies patents at the main group and subgroup levels, using neural network technology to do so automatically.
Neural networks consist of processing units that work in parallel and are arranged in layers. The output of one layer acts as the input to the next layer. Hence, neural networks, as a part of Artificial Intelligence, can perform accurate data analysis and offer precise search results. They not only categorize patents swiftly but also improve consistency in patent classification.
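To make the layer-to-layer flow described above concrete, here is a minimal sketch of a forward pass in plain Python. It is not IPCCAT's actual architecture: the layer sizes, the weights, and the function names (`dense`, `relu`, `softmax`, `forward`) are all assumptions chosen for illustration.

```python
import math

def dense(inputs, weights, biases):
    """One fully connected layer: a weighted sum of the inputs plus a bias, per unit."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def relu(values):
    """Keep positive values, replace negative ones with zero."""
    return [max(0.0, v) for v in values]

def softmax(values):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(values)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def forward(inputs, layers):
    """Pass the inputs through every layer; each output becomes the next input."""
    activation = inputs
    for index, (weights, biases) in enumerate(layers):
        activation = dense(activation, weights, biases)
        if index < len(layers) - 1:  # hidden layers use ReLU
            activation = relu(activation)
    return softmax(activation)  # final layer yields category probabilities

# Hypothetical toy network: 3 input features -> 2 hidden units -> 3 categories.
layers = [
    ([[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]], [0.0, 0.1]),
    ([[1.0, -1.0], [-0.5, 0.5], [0.2, 0.2]], [0.0, 0.0, 0.0]),
]
probs = forward([1.0, 2.0, 3.0], layers)
print(probs)
```

A real text classifier would learn its weights from training data and use far larger layers, but the chaining of one layer's output into the next is the same idea.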
Objective of IPCCAT-neural
The objective of IPCCAT was to build a system of trained neural networks that provides the best possible predictions for classifying a patent. The system aimed to draw on as much data as possible from patents already classified in the IPC.
For example, when the user gives a text input such as the patent abstract, the system will automatically predict the most appropriate IPC symbols under which the particular patent can be classified.
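The input-to-prediction flow in this example can be sketched as follows. This is a deliberately simplified stand-in that ranks symbols with hand-written keyword weights rather than neural networks, and it does not reflect how IPCCAT works internally. The `KEYWORD_WEIGHTS` table, the `predict_top_k` function, and the sample abstract are hypothetical; the IPC symbols are used only as illustrative labels.

```python
import re
from collections import Counter

# Hypothetical keyword weights per IPC symbol (invented for illustration).
KEYWORD_WEIGHTS = {
    "G06N 3/02": {"neural": 2.0, "network": 1.5, "training": 1.0, "layer": 1.0},
    "H04L 9/00": {"encryption": 2.0, "key": 1.5, "secure": 1.0, "network": 0.5},
    "A61K 31/00": {"compound": 2.0, "pharmaceutical": 2.0, "dose": 1.0},
    "H01M 10/00": {"battery": 2.0, "electrode": 1.5, "cell": 1.0},
}

def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())

def predict_top_k(text, k=3):
    """Score every symbol against the text and return the k best (symbol, score) pairs."""
    counts = Counter(tokenize(text))
    scores = {
        symbol: sum(weight * counts[word] for word, weight in weights.items())
        for symbol, weights in KEYWORD_WEIGHTS.items()
    }
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return ranked[:k]

abstract = ("A method for training a neural network in which each layer "
            "is updated using an encryption key.")
top3 = predict_top_k(abstract)
for symbol, score in top3:
    print(symbol, score)
```

The shape of the interaction is the point here: free text goes in, and a short ranked list of candidate symbols comes out for a human to confirm.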
Developments in IPCCAT
In 2016, the system consisted of approximately 700 neural networks covering 7,374 categories at the main group level. For a given text input, the top three predictions were 80% accurate.
IPCCAT-neural text categorization grew dramatically in 2018. By February, IPCCAT-neural at the subgroup level had been integrated into IPCPUB v 7.5. The system had around 8,000 neural networks covering 72,137 categories at the subgroup level. Patent experts stated that the chance of the top three predictions containing a match was 80% for English and 70% for French.
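Figures like these are usually read as top-3 accuracy: a prediction counts as correct if the true symbol appears anywhere among the first three guesses. The sketch below shows how such a figure could be computed. It assumes that reading of the metric, and the function name `top_k_accuracy` and the five sample documents are made up for illustration, not taken from IPCCAT's evaluation data.

```python
def top_k_accuracy(predictions, true_symbols, k=3):
    """Fraction of documents whose true symbol is among the first k predictions."""
    if not true_symbols:
        raise ValueError("at least one document is required")
    hits = sum(
        1 for ranked, truth in zip(predictions, true_symbols)
        if truth in ranked[:k]
    )
    return hits / len(true_symbols)

# Hypothetical ranked predictions for five documents (symbols are placeholders).
predictions = [
    ["A", "B", "C"],  # true symbol ranked first
    ["B", "A", "C"],  # true symbol ranked second
    ["C", "B", "A"],  # true symbol ranked third
    ["B", "C", "D"],  # true symbol missing from the top three
    ["D", "A", "B"],  # true symbol ranked first
]
true_symbols = ["A", "A", "A", "A", "D"]

top3_score = top_k_accuracy(predictions, true_symbols, k=3)
top1_score = top_k_accuracy(predictions, true_symbols, k=1)
print(top3_score, top1_score)
```

Top-3 accuracy is more forgiving than top-1 accuracy, which suits a tool whose suggestions are reviewed by a person before a symbol is assigned.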
In 2019, around 30 million excerpts of English-language patent documents were classified. The latest development in the IPCCAT-neural system allows patents in different languages to be categorized automatically. The input text can be in any of the following languages: English, Chinese, French, Arabic, German, Japanese, Russian, Portuguese, Korean, and Spanish. Prediction accuracy is similar to that for English input. The system covers 73,633 symbols across approximately 8,000 neural networks, and its first three predictions are 84% accurate irrespective of the language.
Artificial Intelligence has revolutionized the International Patent Classification system. Text categorization has never been easier. Classifying patents would have remained a tedious, never-ending task had neural networks not been applied to it. Fortunately, they have eased the work of categorizing patents and utility models and helped the sector prevent backlogs. Patent offices have saved a great deal of the time and energy once spent sorting patents into the thousands of available groups.
The accelerating development of Artificial Intelligence promises a paradigm shift not only in patent classification but also in the regular operations surrounding intellectual property rights.