AI and Patent Analytics
AI in Action: Redefining Patent Processes
In today’s rapidly evolving tech landscape, effective management of intellectual property (IP) is crucial for business success, and patents can offer a competitive edge. With millions of patents filed each year, however, tracking them can feel overwhelming. This is where artificial intelligence (AI) comes in: it streamlines patent searches and enhances analysis. AI tools such as Ambercite and Amplified AI draw on deep learning, neural networks, and natural language processing (NLP) to automate and refine patent searches. These tools not only suggest synonyms and analyze citations but also categorize patents, predict technology trends, assess risks, and create patent landscape maps (Market Trends & Market Trends, 2021). While AI can significantly boost search speed and accuracy, challenges such as algorithmic bias and privacy concerns persist, which is why human expertise remains important in the process. This article gives an overview of AI methods for patent analysis, drawing on multiple sources.
Major Tasks in Patent Analysis
The patent application and approval process is complicated and involves many tasks for both applicants and patent examiners. AI tools can help make these tasks easier (Krestel et al., 2021). Recent studies in the patent domain explore a variety of applications. For example, one study developed a model using deep learning methods such as CNN and LSTM to predict whether a patent application will be granted and to classify the reasons for rejection. Another study applied transformers and Graph Neural Networks (GNN) to patent classification and landscaping, while unsupervised methods using PCA and k-means identified correlations between patent codes and search keywords. Other studies focus on generating new ideas and assessing novelty, for example by using BERT to understand the inventive process and by developing explainable AI (XAI) for novelty analysis (Choi, 2022). Companies are increasingly leveraging AI in patent processes: machine learning techniques appear in 40% of AI-related patents and are growing at an annual rate of 28% (Kelloway, 2024). Notable applications include automated patent drafting tools that use NLP to streamline tasks such as claim renumbering and to provide writing suggestions, as well as AI-powered tools for patent classification, search, and market exploration (Giordano et al., 2023). These advances are improving efficiency and decision-making across the patent landscape.
Here are the main areas where AI can be useful.
Patent Classification: Patent classification is a crucial yet time-consuming step in managing patents. It involves assigning multiple labels to each patent from a hierarchical scheme, so a single patent can belong to several categories. The two main schemes are the International Patent Classification (IPC) and the Cooperative Patent Classification (CPC). The IPC has a complex structure with many sections and subcategories, while the CPC is an expanded version with around 250,000 entries. Classification is challenging because each patent can carry multiple codes, and because patent documents contain different sections, such as titles, abstracts, and claims, which makes it hard to determine which parts are most relevant for classification (AI-based Classification for IP Data, n.d.).
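To make the hierarchical, multi-label nature of these schemes concrete, here is a minimal Python sketch that splits a classification symbol into its levels (section, class, subclass, main group, subgroup). The function names and the example symbols are illustrative assumptions, and the sketch ignores the finer nesting among subgroups that the official schemes express with dot levels.

```python
import re

# Pattern for a symbol such as "H04L 9/32": section letter, two-digit class,
# subclass letter, main group number, "/", subgroup number.
# Sections A-H exist in both IPC and CPC; Y is used by CPC only.
SYMBOL_RE = re.compile(r"^([A-HY])(\d{2})([A-Z])\s*(\d{1,4})/(\d{2,})$")


def hierarchy_levels(symbol: str) -> list[str]:
    """Return the labels for one symbol, ordered from coarse to fine."""
    match = SYMBOL_RE.match(symbol.strip())
    if match is None:
        raise ValueError(f"unrecognized classification symbol: {symbol!r}")
    section, cls, subclass, group, subgroup = match.groups()
    return [
        section,
        f"{section}{cls}",
        f"{section}{cls}{subclass}",
        f"{section}{cls}{subclass} {group}",
        f"{section}{cls}{subclass} {group}/{subgroup}",
    ]


def expand_labels(symbols: list[str]) -> set[str]:
    """Union of all hierarchy levels for a patent that carries several symbols."""
    labels: set[str] = set()
    for symbol in symbols:
        labels.update(hierarchy_levels(symbol))
    return labels


# A patent with two codes (illustrative) yields labels at every level.
print(hierarchy_levels("H04L 9/32"))
print(sorted(expand_labels(["H04L 9/32", "G06F 21/60"])))
```

A classifier can then be trained or evaluated at whichever level is practical, for example the subclass level, where far fewer distinct labels exist than at the subgroup level.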
1. Patent Classification AI Methods
Most patent classification methods use a two-step approach: generating features and then applying a classifier. Neural networks such as LSTM and Bi-GRU are most often combined with word-embedding techniques such as Word2Vec and fastText for IPC classification. Ensemble models combine several classifiers, such as SVM and Bi-LSTM, with different word embeddings to improve performance (Learning Patent Speak: Investigating Domain-Specific Word Embeddings, n.d.). Some methods also classify patent images using the CLIP model and CNNs. Large Language Models (LLMs) such as BERT and XLNet are being fine-tuned for patent classification, with studies reporting improved performance when specialized datasets are used. SciBERT, pre-trained on scientific literature, also outperforms BERT in understanding patent language (Shalaby et al., 2018). Evaluation metrics for patent classification include accuracy, precision, recall, and F1 score. Simpler neural networks were common at first; LSTMs effectively capture the context in patent texts, and LLMs fine-tuned on patent data can provide better context-aware representations, addressing the particular challenges of patent documents (Automated Patent Classification Using Word Embedding, n.d.).
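Because a patent can carry several codes, the evaluation metrics named above are usually computed over sets of labels rather than single labels. The following sketch shows one common way to do this, micro-averaging, together with exact-match accuracy. The label sets are made-up examples, and the surveyed studies may average differently.

```python
def micro_prf(true_labels: list[set[str]],
              pred_labels: list[set[str]]) -> tuple[float, float, float]:
    """Micro-averaged precision, recall and F1 for multi-label predictions."""
    tp = fp = fn = 0
    for truth, pred in zip(true_labels, pred_labels):
        tp += len(truth & pred)   # codes predicted and correct
        fp += len(pred - truth)   # codes predicted but wrong
        fn += len(truth - pred)   # codes missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


def exact_match_accuracy(true_labels: list[set[str]],
                         pred_labels: list[set[str]]) -> float:
    """Share of patents whose predicted code set equals the true set exactly."""
    pairs = list(zip(true_labels, pred_labels))
    return sum(t == p for t, p in pairs) / len(pairs) if pairs else 0.0


# Illustrative subclass-level labels for three patents.
truth = [{"H04L", "G06F"}, {"A61K"}, {"G06N", "G06F"}]
pred = [{"H04L"}, {"A61K", "C07D"}, {"G06N", "G06F"}]

precision, recall, f1 = micro_prf(truth, pred)
accuracy = exact_match_accuracy(truth, pred)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f} accuracy={accuracy:.2f}")
```

Exact-match accuracy is strict for multi-label data, since one missing or extra code makes the whole prediction count as wrong, which is one reason precision, recall, and F1 are reported alongside it.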
2. Patent and Image Retrieval
Patent Retrieval (PR) focuses on finding relevant patent documents and images for a given search query. It is important for identifying new patents, assessing their novelty, and ensuring they don’t infringe existing patents. Retrieving patent images can also inspire new designs. However, PR faces challenges. Text retrieval is tricky because the same invention can be described with different words and phrases, which makes it hard to find the information needed to check for patent infringement. Image retrieval is also difficult because patent images are often black-and-white sketches with numbered parts, which complicates the search process.
Patent retrieval methods span a variety of approaches. Traditional machine learning techniques were among the first used, focusing on query expansion and on finding semantically similar documents. Algorithms such as SVM, Naive Bayes, and decision trees help to retrieve prior art, which is used to determine a patent’s novelty. Some studies merged search results using random forest and Support Vector Regression (Setchi et al., 2021). In recent years, traditional neural networks have gained popularity, with methods such as CNN, DUAL-VGG, and ResNet retrieving patent images based on query images, while BiLSTM and BiGRU focus on keyword identification and semantic relation extraction (Kravets, 2017). Large Language Models (LLMs) are also proving effective for text-related tasks, including patent retrieval. BERT is used for analyzing titles, abstracts, and claims, while methods such as TransE connect patents based on citation and inventor information. Overall, patent retrieval involves multiple tasks, such as defining requirements and merging results from various databases. Traditional machine learning methods such as SVM and decision trees have limitations in capturing the complexity of patent images and texts. Although CNNs are popular for image retrieval, their effectiveness with technical patent images is still uncertain.
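The vocabulary problem and the idea of query expansion can be illustrated with a small, self-contained sketch: documents are ranked by TF-IDF cosine similarity, and the query is optionally expanded with synonyms. The synonym table, the three sample abstracts, and all function names are hypothetical; production systems typically use learned embeddings rather than a hand-written table.

```python
import math
import re
from collections import Counter

# Hypothetical synonym table used for query expansion.
SYNONYMS = {
    "car": ["vehicle", "automobile"],
    "battery": ["accumulator", "cell"],
}

# Made-up abstracts standing in for a patent collection.
DOCS = [
    "An electric vehicle with a rechargeable accumulator pack",
    "A method for brewing coffee using a pressurized chamber",
    "A bicycle frame made of carbon fibre",
]


def tokenize(text: str) -> list[str]:
    return re.findall(r"[a-z0-9]+", text.lower())


def expand_query(tokens: list[str]) -> list[str]:
    """Append known synonyms of each query token."""
    expanded = list(tokens)
    for token in tokens:
        expanded.extend(SYNONYMS.get(token, []))
    return expanded


def cosine(a: dict[str, float], b: dict[str, float]) -> float:
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query: str, docs: list[str], expand: bool = True) -> list[tuple[int, float]]:
    """Return (document index, score) pairs, best match first."""
    doc_tokens = [tokenize(doc) for doc in docs]
    n_docs = len(docs)
    doc_freq = Counter(term for tokens in doc_tokens for term in set(tokens))
    # Smoothed inverse document frequency.
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}
    doc_vecs = [{t: c * idf[t] for t, c in Counter(tokens).items()}
                for tokens in doc_tokens]
    query_tokens = tokenize(query)
    if expand:
        query_tokens = expand_query(query_tokens)
    # Terms that never occur in the collection cannot contribute to a match.
    query_vec = {t: c * idf[t] for t, c in Counter(query_tokens).items() if t in idf}
    scores = [(i, cosine(query_vec, vec)) for i, vec in enumerate(doc_vecs)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


print(search("car battery", DOCS, expand=False))
print(search("car battery", DOCS, expand=True))
```

Without expansion, the query "car battery" shares no terms with the first abstract even though it describes the same kind of invention; with expansion, "vehicle" and "accumulator" match and that abstract moves to the top.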
3. Quality Analysis
Businesses are increasingly focused on evaluating patent value because it strongly influences revenue and investment. Investors want to predict the future value of technologies, which leads many companies to hire patent analysts for quality assessments. This process requires significant effort and expertise. Patent quality can be measured by citations, claims, grant lag, patent family size, and remaining lifetime. The main challenge in assessing patent quality is that the relative importance of each metric is unclear. For common measures such as citations and claims, there is no agreed weighting, and combining this information into a comprehensive evaluation can be complex.
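The weighting problem can be seen in a simple composite score. The sketch below min-max normalizes each indicator across a small portfolio and combines the results with weights. The weights, the choice to treat a shorter grant lag as better, and the sample values are all assumptions made for illustration, not an established standard.

```python
from dataclasses import dataclass


@dataclass
class PatentIndicators:
    forward_citations: int
    claims: int
    grant_lag_years: float
    family_size: int
    remaining_life_years: float


# Hypothetical weights; choosing them is exactly the open problem.
WEIGHTS = {
    "forward_citations": 0.40,
    "claims": 0.15,
    "grant_lag_years": 0.10,
    "family_size": 0.20,
    "remaining_life_years": 0.15,
}
# Assumption: a shorter grant lag counts in a patent's favour.
LOWER_IS_BETTER = {"grant_lag_years"}


def min_max(values: list[float]) -> list[float]:
    """Scale values to [0, 1]; a constant column maps to 0.5."""
    low, high = min(values), max(values)
    if high == low:
        return [0.5] * len(values)
    return [(v - low) / (high - low) for v in values]


def composite_scores(patents: list[PatentIndicators]) -> list[float]:
    """Weighted sum of normalized indicators, one score per patent."""
    scores = [0.0] * len(patents)
    for name, weight in WEIGHTS.items():
        normalized = min_max([float(getattr(p, name)) for p in patents])
        if name in LOWER_IS_BETTER:
            normalized = [1.0 - v for v in normalized]
        for i, value in enumerate(normalized):
            scores[i] += weight * value
    return scores


portfolio = [
    PatentIndicators(30, 20, 2.0, 8, 12.0),
    PatentIndicators(0, 5, 6.0, 1, 3.0),
    PatentIndicators(15, 10, 4.0, 4, 6.0),
]
scores = composite_scores(portfolio)
print([round(s, 3) for s in scores])
```

Changing the entries in WEIGHTS can reorder a portfolio, which is why a score of this kind is only as trustworthy as the justification for its weights.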
The studies on patent quality analysis can be divided into two groups:
Traditional Neural Networks: Various methods have been applied, including an MLP-based approach that uses indices like claim counts and citations to measure patent value. Other studies use Bi-LSTM to classify patents based on maintenance periods, while some predict forward citations and investor reactions using CNN-LSTM networks and other machine learning models.
Large Language Models (LLMs): MSABERT, a variant of BERT, assesses patent value from textual data and uses OECD quality indicators. It effectively handles the complex structure of patent documents. Many measures exist for assessing patent quality, but there is no universally accepted standard, which makes evaluation challenging. Forward citations are often linked to a patent’s value. While deep learning models have shown potential, integrating the technical information, metadata, and images in patent documents is still a challenge. Although MSABERT may be computationally demanding, it could improve patent quality evaluation.
4. Patent Claim Generation and Drafting
Patents often contain a lot of text, so drafting them requires considerable human effort. Generating specific sections of a patent, such as the abstract and claims, typically involves precise, technical language to describe the invention. AI tools can automate parts of this drafting process, saving time and reducing costs for patent attorneys. However, a key challenge is understanding how different parts of the patent depend on each other. For example, an abstract or claim can be used to generate other sections of the patent. Crafting effective instructions for the AI to follow is also crucial. Finally, it’s important to find ways to evaluate whether the AI-generated content meets the required standards.
Generative models are gaining popularity in various fields, including patent generation, where they can significantly reduce the time and effort needed to write lengthy documents. However, only a few studies have explored this area. One approach can create a first independent claim with minimal input but lacks quantitative metrics for assessing the quality of the generated claims; another focuses on personalized claim generation using inventor-specific data and a fine-tuned GPT-2. Although GPT-2 and BERT have been used, patents need precise language and often involve complex concepts that exceed the context limits of these models. GPT-2 can also produce vague text that is unsuitable for patents. Using more advanced models, such as GPT-3.5, along with larger datasets for fine-tuning, could improve the quality of generated patents.
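As one example of what a quantitative metric for generated claims could look like, the sketch below computes a unigram-overlap F1 score (in the spirit of ROUGE-1) between a generated claim and a human-written reference. This is an illustrative choice, not a metric taken from the studies above, and the claim texts are invented. Lexical overlap says nothing about legal soundness, so it could only complement expert review.

```python
import re
from collections import Counter


def unigram_overlap_f1(generated: str, reference: str) -> float:
    """F1 of the word overlap between a generated and a reference claim."""
    gen_counts = Counter(re.findall(r"[a-z0-9]+", generated.lower()))
    ref_counts = Counter(re.findall(r"[a-z0-9]+", reference.lower()))
    # Counter intersection keeps the minimum count of each shared word.
    overlap = sum((gen_counts & ref_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(gen_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


# Invented claim texts for illustration.
reference_claim = "A device comprising a sensor and a processor"
generated_claim = "A device comprising a sensor"

score = unigram_overlap_f1(generated_claim, reference_claim)
print(f"overlap F1 = {score:.3f}")
```

Here every generated word appears in the reference, so precision is perfect, but the generated claim omits the processor, which lowers recall and therefore the overall score.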
List of Popular AI Tools for Patent Analytics
Abbreviation | Full name
LSTM | Long Short-Term Memory
CNN | Convolutional Neural Network
Bi-LSTM | Bidirectional Long Short-Term Memory
Word2Vec | –
GRU | Gated Recurrent Unit
Bi-GRU | Bidirectional Gated Recurrent Unit
DUAL-VGG | Dual Visual Geometry Group
fastText | –
BERT | Bidirectional Encoder Representations from Transformers
RoBERTa | Robustly Optimized BERT Pre-training Approach
SciBERT | Scientific BERT
GPT | Generative Pre-trained Transformer
References
AI-based classification for IP data. (n.d.). Ai-tools-services. https://www.wipo.int/web/ai-tools-services/classification-assistant
Choi, S., Lee, H., Park, E., & Choi, S. (2022). Deep learning for patent landscaping using transformer and graph embedding. Technological Forecasting and Social Change, 175. https://ideas.repec.org/a/eee/tefoso/v175y2022ics0040162521008441.html
Giordano, V., Puccetti, G., Chiarello, F., Pavanello, T., & Fantoni, G. (2023). Unveiling the inventive process from patents by extracting problems, solutions and advantages with natural language processing. Expert Systems With Applications, 229, 120499. https://doi.org/10.1016/j.eswa.2023.120499
Jiang, S., Luo, J., Ruiz-Pava, G., Hu, J., & Magee, C. L. (2020). Deriving design feature vectors for patent images using convolutional neural networks. Journal of Mechanical Design, 143(6). https://doi.org/10.1115/1.4049214
Krestel, R., Chikkamath, R., Hewel, C., & Risch, J. (2021). A survey on deep learning for patent analysis. World Patent Information, 65, 102035. https://doi.org/10.1016/j.wpi.2021.102035
Kelloway, B. (2024, February 23). Can AI Invent Independently? How AI is Changing the Patent Industry – IP.com. IP.com: Intellectual Property & Patent Intelligence Software. https://ip.com/blog/can-ai-invent-independently-how-ai-is-changing-the-patent-industry/
Learning Patent Speak: Investigating Domain-Specific Word embeddings. (n.d.). IEEE Conference Publication | IEEE Xplore. https://ieeexplore.ieee.org/document/8846972
Shalaby, M., Stutzki, J., Schubert, M., & Günnemann, S. (2018). An LSTM Approach to Patent Classification based on Fixed Hierarchy Vectors. In Society for Industrial and Applied Mathematics eBooks (pp. 495–503). https://doi.org/10.1137/1.9781611975321.56
Automated patent classification using word embedding. (n.d.). IEEE Conference Publication | IEEE Xplore. https://ieeexplore.ieee.org/document/8260665
Setchi, R., Spasić, I., Morgan, J., Harrison, C., & Corken, R. (2021). Artificial intelligence for patent prior art searching. World Patent Information, 64, 102021. https://doi.org/10.1016/j.wpi.2021.102021