Quantifying patent commercialisation to support engineering design

Lead Research Organisation: University of Strathclyde
Department Name: Design Manufacture and Engineering Man

Abstract

This project will investigate if crowdsourcing can be used to aggregate the content of disparate, open-data sources across the internet to determine which patents underpin commercial products, and organise and present these according to technical criteria in a visual "gallery" form appropriate for engineering design.

Patents are frequently used to quantify levels of innovation associated with specific regions or companies. However, despite the development of sophisticated data mining tools to support the analysis of over 50 million online patent records, little is known about which patents are actually "commercialized" and how they are embodied in commercial products. Because of this, "patent informatics" has been inherently limited to the study of the records, rather than the use, of Intellectual Property (IP). This information gap inevitably reduces the accuracy of academic and commercial analyses that use patent data for applications such as innovation research, R&D fore-sighting, and IP portfolio valuations. Furthermore, existing data maps are not presented in a form that is useful for engineering designers when conceptualising and embodying products: they are predominantly text-based (and often deliberately obfuscated), whereas a more visual presentation with exemplars and appropriate technical taxonomic terms would greatly enhance their utility during engineering design development.

Crowdsourcing utilises large, open networks of people to complete discrete tasks. Virtual tools are used to co-ordinate the distribution of tasks, payment and the collation of results, resulting in a labour market that is open 24/7 and a diverse workforce available to perform tasks quickly and cheaply. The distributed network of human workers provides on-line, "black-box" reasoning capabilities that could far exceed the capabilities of current AI technologies (e.g. genetic algorithms, neural-nets, case-based reasoning) in terms of flexibility and scope.

This project proposes that crowdsourcing can be utilised to access open data sources such as user manuals, product labelling, court proceedings and company web pages to understand which patents are actively used in current products and how they have been embodied. With a more accurate representation of innovation commercialisation, technical metadata (labelling), and utilisation, we envisage patent searches not as a stage-gate check but as a revitalised source of design inspiration. Indeed, if crowdsourcing proves a cheap, scalable way of collating this information and applying appropriate taxonomic and visual engineering information, it could fundamentally alter the early phases of engineering design. To this end, the project will result in a visualization tool that can be used to both guide and inspire design conceptualisation and embodiment.

Planned Impact

The project's results will deliver impact to engineering designers, innovators, government policy and academic researchers in the following ways:

1. Engineering designers - The project will result in improved performance during the conceptual phase of engineering design. This will be achieved by delivering technical patent visualisation maps/galleries that designers can utilise as a reference and source of inspiration when undertaking design work. This will increase competitiveness by shortening the development cycle and producing better design solutions. We will share the results of the research through academic journals and conferences to allow other groups to implement similar crowdsourcing techniques to extract the maximum value from the current patent databases. The formulation of visual galleries and maps to support technical problem solving, however, could be an aspect of the work that can be packaged as a tool with commercial value. This will be assessed at the end of project Phase 3.

2. Innovators - Innovators in general will be able to identify the strengths and limitations of existing patents more easily by investigating the physical products that embody them. The members of the project's steering committee, who are all professionally involved in the effective use of IP in support of innovators, will facilitate this on a practical level, with the research team presenting the principles of crowdsourcing and patent visualisation to the wider academic community.

3. Government policy - The results have the potential to significantly affect policy and procedure with regard to patent management. As we have outlined in the case for support, there have been recent moves in the US to assess how crowdsourcing can be used to derive more meaningful information from the existing patent databases. If the proposed protocol proves successful, it could be applied to areas beyond engineering design and influence how governments and patent offices utilise patent data as a means of innovation measurement. The project steering committee includes representatives from an executive agency (Intellectual Property Office), a public body (Scottish Enterprise) and a commercial patent attorney (Marks & Clerk). Ensuring the voices of all these groups are heard during the development and formulation of the crowdsourcing protocol is critical to ensuring the proposed approach is viable. The four steering committee meetings will act as a forum to review the state of the art and trends in patent policy, and also provide access to end users (patentees). By applying the initial results of the research within the steering committee members' organisations, we can assess the reaction to and effectiveness of the work, and inform our wider approach to impact.

4. Academic researchers - Those investigating the economic impact of patents will have quantifiable data on the products associated with particular patents. The data set generated through project Phase 2 (Scale up and benchmark) will be made available through the project website for others to download and utilise. The application of the crowdsourcing process will be disseminated via conference presentations and specialist journal papers (e.g. World Patent Information). Similarly, the patent visualization method and its implications for the design process will be presented to engineering design researchers via high impact journals such as Design Studies and Research in Engineering Design. Publication activities will be led by the PI with support from the RA and the rest of the academic team.
 
Description Today, although there are over 50 million online patent records instantly available, understanding their impact on innovation has never been harder. The volume and language of patents combine to make interpretation of their contents and assessment of their significance in the context of any given project a laborious process. The problem is exemplified by so-called "patent thickets", defined as "a dense web of overlapping intellectual property rights that a company must hack its way through in order to actually commercialize new technology" (Shapiro, 2000). Given this, designers need new tools to allow them to quickly and accurately understand the "patent landscape" in the context of a new design or innovation.
This research investigates the feasibility of using a crowdsourcing process to cut through the patent jungle and deliver a concise summary of the relevant Intellectual Property and its applications in an area of interest. Key components in crowdsourcing workflows are repetition (i.e. multiple, parallel tasks to generate sets of "answers"), peer review and merger, iteration, and the linkage of payment to quality assessments. This proposal seeks to quantify how well these techniques could be used to locate relevant patent records, summarize their contents and collaboratively construct an infographic that shows the relative strength of clustering around topics. In other words, the project will focus on the use of the crowd to provide the designer with two specific areas of functionality: 1) Patent Landscape Visualisation; 2) Patent usage assessment.
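As a rough illustration of two of the workflow components named above (repetition with merger, and payment linked to quality), the sketch below merges repeated answers to a single task by majority vote and pays a bonus to workers who match the consensus. It is a minimal sketch under stated assumptions, not the project's protocol: the function names, the majority-vote rule, and the fee values are all hypothetical.

```python
from collections import Counter

def merge_answers(answers):
    """Merge repeated answers to one task by simple majority vote.

    `answers` maps a worker id to that worker's answer.
    Returns (consensus answer, share of workers who gave it).
    Ties go to the answer seen first (an arbitrary choice for this sketch).
    """
    tally = Counter(answers.values())
    consensus, votes = tally.most_common(1)[0]
    return consensus, votes / len(answers)

def payments_in_cents(answers, consensus, base_fee=10, bonus=5):
    """Link payment to quality: a base fee plus a bonus for matching consensus.

    The fee values are hypothetical and held as integer cents.
    """
    return {
        worker: base_fee + (bonus if answer == consensus else 0)
        for worker, answer in answers.items()
    }

# Hypothetical task: three workers label the same patent.
answers = {"w1": "pump seal", "w2": "pump seal", "w3": "valve"}
consensus, share = merge_answers(answers)
pay = payments_in_cents(answers, consensus)
```

In practice a peer-review step or an expert check could replace the majority vote; the point is only that repeated, parallel answers give a basis for both merging results and scoring workers.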
Exploitation Route This work is cross-disciplinary and, while the output of the research is aimed towards the engineering design community, the results will be of interest to researchers in a number of different fields. Firstly, the construction of an appropriate task template for distribution to the crowd will be of interest to researchers in the growing field of crowdsourcing. Framing tasks, identifying reliable crowd members and setting appropriate reward structures are all challenges that beset crowdsourcing activities, irrespective of the field of application. Secondly, this work has implications for researchers in patents and innovation, particularly in relation to informatics. Patents are widely used as a means to assess innovation activity, but the clearer visibility of commercialisation patterns proposed by this research will introduce a new layer of understanding. Finally, the project output in terms of patent visualisation will be presented to the engineering design community. Work on design methods through design conceptualisation and embodiment has highlighted the challenges of understanding competition and utilising relevant examples. The patent visualisation map/gallery will be aligned with current methods and presented as a new model for assessing the innovation landscape and facilitating concept generation.
Sectors Creative Economy,Digital/Communication/Information Technologies (including Software),Education

 
Description Our research addressed the Design the Future theme of 'new design paradigms' through enhanced use of the patent database. The project was innovative in utilising crowdsourcing as the means to identify and organise relevant patents for use in the engineering design process. Given the potential advantages of aggregating human insight to overcome the limitations of computer algorithms and experts, the aim was to establish this as a cheap, scalable way of finding, collating and applying appropriate taxonomic engineering information via patents. The project has:
• Identified a 'search, cluster, utilise' thematic structure for the exploitation of patents
• Developed a new research platform with appropriate crowd workflows
• Evaluated clustering protocols for the crowd to effectively group and prioritise patent sets
• Identified 'collaborative tagging' as a novel means to facilitate patent searching
Thematic structure
The project began by identifying key themes: finding relevant patents (searching), organising groupings in relation to particular problems (clustering), and applying these in engineering design activities (utilising). It is clear that although computational algorithms are maturing, human intervention remains essential to develop appropriate search strategies and to bring insight to patent clusters. Additionally, while there are various procedural approaches for the use of patents in the early phases of the design process (e.g. TRIZ, forced analogy, relational words), there is limited guidance on how unabstracted design knowledge can be extracted and applied. Four engineering activities across the development process have therefore been identified as a basis for exploration: scoping (opportunities and constraints), generation (inspiration and functional coupling), embodiment (context and object-oriented views) and testing (checking and commercialisation assessment).
Research platform
A major output of the feasibility study is the creation of a software interface to manage the distribution, execution and aggregation of patent-related tasks to the crowd. The system allows online workers from established public platforms (e.g. Amazon's mTurk and Crowdflower) to engage in tasks much more complex than their normal work. It features visually-orientated assessment and organisation, the ability to filter crowd workers based on performance, and integration with appropriate payment systems. The basic architecture has been constructed using open source applications. The software has allowed the effectiveness of different approaches to clustering to be tested while also providing support for search, back-end processes and presentation of results. The platform will allow future experiments to be quickly implemented and trialled.
Clustering
Patent clustering helps identify meaningful patterns for forecasting technological trends, detecting infringement, and identifying technological vacuums. It has, however, consistently proved difficult to develop useful and consistent groupings, and the process is resource intensive. We therefore performed comparative analyses of the crowd with design experts, computer algorithms and commercial landscaping software within a 45-patent subset to assess its benefits. This included the use of data from previous research published by Fu et al. on computationally structured databases. The results for the 45-patent benchmark showed that the crowd workers created distinct and functional cluster labels with speed and cost benefits.
Collaborative tagging
We have identified a novel approach to the sourcing of patents - what we have termed 'collaborative tagging'. This provides the crowd with diagrammatic layouts or assemblies that describe the problem or product in question, with patents then sourced and mapped as shown in the 'Search' theme of Figure 1.
We anticipate combining this with wider, 'blue sky' searches and articulating it with the cluster protocols described above.
Conclusions and initial impact
The project has allowed us to demonstrate the economic and intellectual viability of a crowdsourcing approach to the organisation of patents. The emergent themes have assisted the Weir Group in understanding patent management issues (mapping to products, unused patents, cross-domain applications) as well as demonstrating the cost benefits of crowdsourcing in comparison to using IP consultants and bespoke software platforms. It has highlighted a number of other industrial tasks, such as patent status monitoring, geographical spectrum expansion, and technology scanning, which Weir Group are currently following up internally. In terms of outputs, a research tool (i.e. the software platform) with significant potential for expansion has been constructed, allowing patents to be utilised more effectively in a range of engineering design activities. The economic and time benefits in relation to clustering activity have been benchmarked for a given data set, and a potentially innovative approach to searching has been identified for further development. Other project impacts include:
• Staging of four project steering committee meetings, which have provided a platform for discussion and developed relationships across the partner organisations.
• Dissemination at two international conferences (IPDMC Glasgow and DCC16 Chicago).
• Facilitation of an MSc group project for Weir Group entitled 'Managing IP for Innovation'. Subsequently, Weir funded two of the group as interns to further develop the findings.
• Employment of a summer intern who explored 'search by image' strategies, producing a workflow for better patent searching that fed into the collaborative tagging research theme.
• Inclusion in the University's 'Images of Research' photographic exhibition, with accompanying video and literature on project aims.
• Participation in the Strathclyde Institute of Operations Management showcase event around the manufacturing themes of productivity, sustainable business models and leadership.
• Emerging collaboration with MINES ParisTech to explore the use of alternate design theories in the crowd platform to extract patent knowledge.
First Year Of Impact 2016
Sector Creative Economy,Financial Services, and Management Consultancy,Manufacturing, including Industrial Biotechnology
Impact Types Societal,Economic

 
Title Protocols to support patent clustering using crowdsourcing 
Description We have developed workflows that allow the crowd to effectively undertake clustering-related activity for patent analysis. Further work will broaden this to encompass searching and utilisation activities.
Type Of Material Improvements to research infrastructure 
Year Produced 2016 
Provided To Others? Yes  
Impact The workflows have been implemented in our software platform and used for initial experimentation and benchmarking. As we validate its effectiveness, the platform could be used more widely by organisations wishing to undertake patent analysis. 
 
Title Dataset to find effectiveness of using crowd intelligence for generation of patent clusters 
Description This dataset contains data collected and analysed to demonstrate the advantages of using crowd intelligence for effective generation of patent clusters at lower cost and with greater rationale. The dataset is structured around the two major tasks undertaken: patent clustering and ranking patent clusters against the given design problem. Please read our journal paper "Crowd-generated patent clusters in relation to the algorithm and expert approaches" before exploring this dataset. The file names are numbered to indicate the order in which to explore the dataset. The uploaded data documents are as follows:
- Documents 1 and 2 provide all information collected from crowdsourcing in Crowdflower and mTurk respectively. Please note that these documents contain both accepted and rejected results.
- Documents 3 and 4 provide the information accepted for further analysis from Crowdflower and mTurk respectively.
- Document 5 contains information about the number of clusters and patent pairs for every approach. The second sheet contains a graph of the number of clusters against patent pairs.
- Document 6 contains a matrix of patent pairs against crowd workers. A patent pair is marked '1' if linked, otherwise '0'. The second sheet contains a graph of the number of crowd workers in agreement against the frequency of linked patent pairs. The third sheet contains a graph comparing the different approaches in terms of the number of workers in pair agreement and the percentage of pair agreement.
- Document 7 contains all cluster labels generated by the various approaches. Different colours are used to differentiate between approaches. The number '1' represents the presence of a label in a particular approach. This number may be greater than one, in which case it represents the number of times the label was mentioned by experts/crowd.
- Document 8 contains the patent relevance rankings given by each crowd worker. The first and second sheets show relevance for mTurk and Crowdflower workers respectively. The aggregated rankings are provided on the extreme right of both sheets, and the resulting relevance rankings are shown in red. The third sheet provides a summary of these aggregated results. The last sheet provides a graph of patent ranking against number of patents for the crowd and Fu's algorithm approach.
- Document 9 contains the evaluation of crowd responses by three evaluators. For each patent pair, an evaluator's score is recorded as '1' if they agree and '0' if they do not. These data were used to illustrate the relationship between the number of evaluators in agreement and the number of crowd workers who agreed with a patent pair.
- Document 10 details all the labels chosen by the three evaluators from the crowd responses. The data illustrate the evaluators' agreement with both patent pairs and labels, with patent pairs only, with no patent pairs, and the extra labels generated.
Type Of Material Database/Collection of data 
Year Produced 2017 
Provided To Others? Yes  
Impact Unknown 
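The pair-by-worker matrix described for Document 6 above lends itself to a simple agreement analysis. The following is a minimal sketch of how such a matrix might be summarised, assuming each row is a patent pair and each column a crowd worker, with '1' marking a linked pair. The patent identifiers, the toy values and the function names are hypothetical and are not taken from the dataset.

```python
from collections import Counter

# Hypothetical toy data: keys are patent pairs, values are one 0/1 entry per worker.
# 1 means that worker placed both patents of the pair in the same cluster.
pair_worker_matrix = {
    ("P1", "P2"): [1, 1, 0, 1],
    ("P1", "P3"): [0, 0, 0, 1],
    ("P2", "P3"): [1, 1, 1, 1],
    ("P3", "P4"): [0, 0, 0, 0],
}

def agreement_counts(matrix):
    """Number of workers who linked each patent pair."""
    return {pair: sum(votes) for pair, votes in matrix.items()}

def agreement_histogram(counts):
    """How many linked pairs were agreed on by exactly k workers (k >= 1)."""
    return Counter(k for k in counts.values() if k > 0)

def pairs_with_min_agreement(counts, threshold):
    """Pairs linked by at least `threshold` workers, in sorted order."""
    return sorted(pair for pair, k in counts.items() if k >= threshold)

counts = agreement_counts(pair_worker_matrix)
histogram = agreement_histogram(counts)
strong_pairs = pairs_with_min_agreement(counts, 3)
```

The histogram corresponds to the kind of "workers in agreement versus frequency of linked pairs" graph the dataset description mentions; the threshold of 3 is an arbitrary choice for the toy data.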
 
Title Dataset used to establish a new model of patent database interpretation for user-centred design 
Description This dataset contains data collected and analysed to establish a new model of database interpretation for user-centred design by realising the affective potential of patents. The dataset is structured around two major elements: crowd responses and matrices relating adjectives/key phrases to patents. Please read our journal paper "Realising the affective potential of patents: a new model of database interpretation for user-centred design" before exploring this dataset. The file names are numbered to indicate the order in which to explore the dataset. The uploaded data documents are as follows:
- Document 1 provides all information about the patent set used in this work. This document contains the patent number, abstract and main image web link for 60 patents.
- Documents 2, 3 and 4 contain all the accepted responses from the crowd for the ease of use, semantics and visual attractiveness parameters respectively. These documents also contain adjectives and their respective frequencies in the second sheet (from the first question) and third sheet (from the second question) of the Excel documents.
- Documents 5 and 6 contain semantic relatedness scores in matrix form, relating adjectives with patents, and adjectives and functional key phrases with patents, respectively. Please note that both of these files are large.
Type Of Material Database/Collection of data 
Year Produced 2018 
Provided To Others? Yes  
Impact Unknown 
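A relatedness matrix of the kind described for Documents 5 and 6 above can be queried to find the patents most associated with a given adjective. The snippet below is only an illustrative sketch: the adjectives, the placeholder patent identifiers, the scores and the `top_patents` helper are all hypothetical, and the real values are in the dataset files.

```python
# Hypothetical toy scores: adjective -> {patent id -> semantic relatedness score}.
relatedness = {
    "comfortable": {"PAT-A": 0.82, "PAT-B": 0.10, "PAT-C": 0.55},
    "sleek": {"PAT-A": 0.20, "PAT-B": 0.91, "PAT-C": 0.40},
}

def top_patents(matrix, adjective, n=2):
    """Return the n patent ids with the highest relatedness to the adjective."""
    row = matrix[adjective]
    return sorted(row, key=row.get, reverse=True)[:n]

best_for_comfortable = top_patents(relatedness, "comfortable")
best_for_sleek = top_patents(relatedness, "sleek", n=1)
```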
 
Description Emerging collaboration with MINES ParisTech to explore the use of alternate design theories in the crowd platform to extract patent knowledge 
Organisation Mines ParisTech
Country France 
Sector Academic/University 
PI Contribution Emerging collaboration to explore the use of alternate design theories in the crowd platform to extract patent knowledge.
Collaborator Contribution Expertise in C-K Theory.
Impact Initial experimentation work completed, planned joint publication in 2017.
Start Year 2016
 
Title Crowdsourcing platform 
Description Basic software architecture that provides a platform for the distribution of crowd tasks in relation to patent analysis 
Type Of Technology Webtool/Application 
Year Produced 2017 
Impact The platform has allowed us to gather data and benchmark our protocols for the analysis of patents. Further development may make this a viable solution for organisations to use and adapt for patent analysis. 
 
Description Project steering group 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact The project steering group consists of representatives from Scottish Enterprise, the Intellectual Property Office and HGF patent attorneys. The findings are informing their current practice as well as shaping the research agenda going forward.
Year(s) Of Engagement Activity 2015,2016