Portfolio Analysis and Maps in terms of Patent Classes
The resulting figure (in VOSviewer) can be used to inspect a portfolio; one can also make overlays for different years and combine these into an animation using PowerPoint.
1. Preparing input files
a. Download the following files from http://www.leydesdorff.net/ipcmaps into a single folder:
· ipc.exe;
· ipc.dbf (with basis information about the classes);
· uspto1.exe (needed for downloading USPTO patents);
· cos_ipc3.dbf and cos_ipc4.dbf (needed for the computation of distances on the map).
b. Run ipc.exe.
2. Options within ipc.exe
a. The program asks for a short name (≤ 10 characters) in each run. This name will be used as the variable name in later parts of the routine;
b. The first option is to download the patents from USPTO at http://patft.uspto.gov/netahtml/PTO/search-adv.htm ; detailed instructions for downloading can be found at http://www.leydesdorff.net/ipcmaps;
c. USPTO returns a maximum of 1000 records at a time, but one can download follow-up batches; after each download, save the files in another folder or as a zip file.
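The bookkeeping for follow-up batches can be sketched in a few lines of Python. This is only an illustration of how a result set splits into batches of at most 1000 records; the function name batch_ranges and the example total of 2350 hits are assumptions made here, and the sketch does not reproduce the prompts of uspto1.exe.

```python
def batch_ranges(total_hits, batch_size=1000):
    """Return 1-based, inclusive (start, end) record ranges covering total_hits.

    batch_size defaults to the 1000-record limit mentioned in the text.
    """
    if total_hits < 0 or batch_size <= 0:
        raise ValueError("total_hits must be >= 0 and batch_size must be > 0")
    return [
        (start, min(start + batch_size - 1, total_hits))
        for start in range(1, total_hits + 1, batch_size)
    ]


# Hypothetical search with 2350 hits: three downloads are needed.
for start, end in batch_ranges(2350):
    print(f"download records {start} to {end}, then store the files elsewhere")
```

Each range corresponds to one download, after which the files should be moved to another folder or zipped as described above.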
3. The incremental construction of the files matrix.dbf and rao.dbf
a. After each run, a column variable is added to the (local) file matrix.dbf; this variable contains the distribution of the document set under study over the 630 CPC/IPC classes. If the file matrix.dbf is absent, it is generated de novo and the current run provides the first variable; matrix.dbf can be read by Excel, SPSS, etc., for further (statistical) analysis;
b. Similarly, a row is added after each run to the file rao.dbf, with the diversity measures (explained in the article) as variables. This file is likewise generated de novo if absent. Distances are based on [1 – cos(x,y)] for each two distributions x and y;
c. The routine ipc2cos.exe reads the file matrix.dbf and produces cosine.net and coocc.dat as (normalized) co-occurrence matrices that can be used in network analysis and visualization programs such as Pajek or UCInet.
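The distance 1 – cos(x,y) mentioned above can be illustrated with a minimal Python sketch. The diversity function is an assumption: the file name rao.dbf suggests Rao-Stirling diversity (the sum of p_i * p_j * d_ij over all ordered pairs i ≠ j), but the measures actually computed are those defined in the article. The toy data use three classes instead of 630, and all function and variable names are hypothetical.

```python
import math


def cosine(x, y):
    """Cosine similarity of two equally long numeric vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0  # convention chosen for this sketch only
    return dot / (norm_x * norm_y)


def distance(x, y):
    """Distance between two distributions as 1 - cos(x, y)."""
    return 1.0 - cosine(x, y)


def rao_stirling(counts, profiles):
    """Assumed diversity measure: sum of p_i * p_j * d_ij over ordered pairs i != j.

    counts   -- number of patents per class in one document set
    profiles -- one vector per class, used to compute the class distances
    """
    total = sum(counts)
    if total == 0:
        return 0.0
    shares = [c / total for c in counts]
    diversity = 0.0
    for i, p_i in enumerate(shares):
        for j, p_j in enumerate(shares):
            if i != j:
                diversity += p_i * p_j * distance(profiles[i], profiles[j])
    return diversity


# Toy class profiles: classes 0 and 2 are identical, class 1 is unrelated.
profiles = [[1, 0], [0, 1], [1, 0]]

print(rao_stirling([2, 2, 0], profiles))  # portfolio split over distant classes
print(rao_stirling([2, 0, 2], profiles))  # portfolio split over similar classes
```

In this toy case the portfolio spread over two unrelated classes scores higher than the one spread over two identical classes, which is the reason for weighting the class shares by distance.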
4. Output files in each run
a. Two input files (vos3.txt and vos4.txt) are generated for mapping the portfolio at the three- or four-digit level of CPC/IPC, respectively, using VOSviewer; the distances and colors (corresponding to clusters) in the maps are based on the base-map (Leydesdorff et al., 2014);
b. Two input files (ipc3.vec and ipc4.vec) can be used as vectors in Pajek files provided at http://www.leydesdorff.net/ipcmaps . This allows for layouts other than VOSviewer and for more detailed network analysis and statistics;
c. The various fields in the USPTO records are organized in a series of databases that can be related (e.g., in MS Access) using the field “nr”.
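Relating the databases through the field "nr" amounts to a join on that key. The sketch below uses Python's built-in sqlite3 module instead of MS Access; the table names (patents, classes), the other column names, and the records are hypothetical and do not reflect the actual files written by the routine.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with two toy tables that share the key 'nr'."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE patents (nr INTEGER PRIMARY KEY, title TEXT)")
    con.execute("CREATE TABLE classes (nr INTEGER, ipc TEXT)")
    con.executemany(
        "INSERT INTO patents VALUES (?, ?)",
        [(1, "Patent A"), (2, "Patent B")],
    )
    con.executemany(
        "INSERT INTO classes VALUES (?, ?)",
        [(1, "A61K"), (1, "C07D"), (2, "H04L")],
    )
    return con


def classes_per_patent(con):
    """Join the two tables on 'nr' and list every patent with each of its classes."""
    query = (
        "SELECT p.nr, p.title, c.ipc "
        "FROM patents AS p JOIN classes AS c ON c.nr = p.nr "
        "ORDER BY p.nr, c.ipc"
    )
    return con.execute(query).fetchall()


con = build_demo_db()
for nr, title, ipc in classes_per_patent(con):
    print(nr, title, ipc)
```

The same join can be defined in MS Access by linking the tables on "nr" in the relationships view.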