EnroWiki : CitationAnalysis

Citation analysis in journals and patents

Citations in patents

One can gauge the pace of innovation by measuring median time lags from citing applications to cited patent grant dates. […] citation rates vary by year of issue (earlier patents have had more time to be cited) and by technology area. Normalization is necessary. (Porter, Alan L. and Cunningham, Scott W., Tech mining: Exploiting New Technologies for Competitive Advantage, John Wiley & Sons, 2005, p. 238)
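
The two measurements mentioned above can be sketched as follows. This is a minimal illustration, not Porter and Cunningham's procedure: the sample data, the function names, and the choice to normalize by dividing each patent's count by the mean count of its (grant year, technology area) cohort are all assumptions made for the example.

```python
from collections import defaultdict
from datetime import date
from statistics import mean, median

# Hypothetical citation pairs: (citing application filing date, cited patent grant date).
citation_pairs = [
    (date(2003, 6, 1), date(1999, 6, 1)),
    (date(2004, 1, 15), date(2001, 1, 15)),
    (date(2004, 9, 30), date(1998, 9, 30)),
]

# Hypothetical patents: (patent id, grant year, technology area, citations received).
patents = [
    ("P1", 1995, "chem", 30),
    ("P2", 1995, "chem", 10),
    ("P3", 2000, "chem", 6),
    ("P4", 2000, "chem", 2),
    ("P5", 2000, "elec", 12),
]


def median_lag_years(pairs):
    """Median lag in years between cited grant date and citing filing date."""
    lags = [(citing - cited).days / 365.25 for citing, cited in pairs]
    return median(lags)


def normalized_citations(records):
    """Divide each count by the mean count of its (year, area) cohort.

    Assumed normalization scheme: a score above 1 means the patent is cited
    more often than the average patent of the same age and technology area.
    """
    cohorts = defaultdict(list)
    for _, year, area, cites in records:
        cohorts[(year, area)].append(cites)
    baseline = {key: mean(values) for key, values in cohorts.items()}
    scores = {}
    for pid, year, area, cites in records:
        expected = baseline[(year, area)]
        # A cohort with no citations at all gets a neutral score of 0.0.
        scores[pid] = cites / expected if expected else 0.0
    return scores


print(median_lag_years(citation_pairs))
print(normalized_citations(patents))
```

In this sample, P1 (30 citations) and P3 (6 citations) receive the same normalized score because each is cited 1.5 times as often as its own cohort average, which is the kind of age effect the raw counts would hide.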

[The citation] information gives the closest state of the art detected by a patent examiner during the statutory search. When it takes him about 5 min to classify a patent according to the IPC scheme, his search lasts for about 1 day. In other words the work involved in producing the citation field represents the biggest added value to a patent document. It is therefore legitimate to use this field as much as possible. (Schwander, Paul, An evaluation of patent searching resources: comparing the professional and free on-line databases, in 22 World Patent Information, 147-165 (2000))

Distinguish between:
1) routine use of citations (as a search tool, as suggested by Garfield; reverse searching; understanding the scope of the invention)
2) citation studies, whose aims may be:
a) scientometrics (evaluating a research organization, studying the links between research and technology…)
b) IP management & competitive intelligence (portfolio value evaluation, competitor tracking, licensing opportunities…)

Little has been published on the cognitive and sociological functions of citations in patents (Collins & Wyatt, 1988). This is in marked contrast to the many studies on citer motivations in journal articles (Cronin, 1984). (Oppenheim, Charles, Do Patent Citations Count?, in Cronin, Blaise & Barsky Atkins, Helen (ed.), The Web of Knowledge: A Festschrift in honor of Eugene Garfield, Information Today, Inc., 405, 2000)

Some of the citations made by examiners appear bizarre to librarians or information scientists. Garfield (1979, p. 39) commented cautiously that “there are no categorical answers” to the question of whether items cited by patent examiners are indeed relevant to the subject area. Garfield (1966), Oppenheim (1976), Dunlop & Oppenheim (1980) and van Dulken (1999) provide evidence that many examiner citations are inappropriate. […] only about one third of all examiner citations have a close relationship to the subject matter of the citing patents, although they are in the same broad technologies. […] One particularly interesting result was that where two patents in the same family were checked, one an EPO patent and the other a US patent, more than 75% of the references cited were different. (Oppenheim, Charles, Do Patent Citations Count?, op. cit.)

Despite all this confusion, there can be no disputing Garfield’s statement (Garfield, 1979) that “a citation index of the patent literature identifies relationships between patents that are not identified any other way.” (Oppenheim, Charles, Do Patent Citations Count?, op. cit.)

The approach taken in this paper is to break down the use of patent citation analysis into five sub-headings:

Evaluation of the performance of an industry or a country’s technology.
Tracing the transfer of knowledge from science to technology, from technology to technology, or from the defence to civil fields. This is currently the most popular area for research.
Identifying key earlier patents for patent litigation purposes, for identifying the history of a technical subject, and in particular for identifying key pioneer patents.
Identifying the speed of development of a technical subject.
Miscellaneous applications (Oppenheim, Charles, Do Patent Citations Count?, op. cit.)

Both Trajtenberg (1990) and Albert et al. (1991) point out that the technological impact and commercial value of patents are two separate issues. Because of the difficulties of separating out these two factors, most researchers are content to rely on the use of terms such as “importance”. As Narin, Noma & Perry (1987) point out, patents are probably highly cited for two sometimes interrelated reasons. The first is that they are seminal patents; this implies that the originating company will have a disproportionate share of that technology. Secondly, high citations are often due to follow up patents from the same company. That is, highly cited patents are often part of a tightly interlocked stream of inventions from the company. Overall, then, the evidence is still inconclusive, but there is some evidence that highly cited patents are indeed those that are technologically or economically important. (Oppenheim, Charles, Do Patent Citations Count?, op. cit.)

However, the most detailed criticism of patent citation analysis can be found in Kaback, Lambert & Simmons (1994). These three authors are extremely experienced patent searchers from the chemical and pharmaceutical industries, and their criticisms need to be considered carefully. They emphasise that patents are not governed by the same rules of etiquette as journal articles. The references that are made by the applicant rarely look like the bibliography of a journal article. The authors of the patent application wish to avoid any implication that the current patent application grew naturally out of earlier work. Thus, most of the prior art that is cited by the applicants relates to unsuccessful approaches to the question. Turning to the examiner citations, these are driven solely by the claims in the applicant’s patent specification. The claims precisely define the monopoly right that the applicant wishes to gain. The examiner is only required to cite one reference that anticipates the claim in some way. The examiner may add further citations, but the primary function of the citations remains the same – to prove that what is claimed is not new. This point should be stressed: the text of the patent claims is not identical to, and does not necessarily reflect, the text of the remainder of the patent specification. The examiner’s preoccupation with the claims means that the items cited by the examiner do not necessarily reflect the bulk of the patent specification. […] The authors conclude that patent citations are useful as subject matter search tools, just as Garfield suggested. They also agree that frequently the highly cited patents are indeed the industrially important breakthroughs. In summary, they warn strongly against simplistic use of citation counts. (Oppenheim, Charles, Do Patent Citations Count?, op. cit.)

Patents provide far fewer citations than journal articles, so the possibilities for statistical analysis are limited. There is no evidence that either examiner or applicant citations reflect the subject matter of the citing patent. Their reasons for citing are different, but neither is to do with providing a useful background literature survey. In any case, there is a paucity of understanding about examiner and applicant citing motivations. (Oppenheim, Charles, Do Patent Citations Count?, op. cit.)

The use of citations as a quality indicator of patents is advocated by empirical research, which has established a positive relationship between the citation frequency of patents and different measures of commercial success. (Ernst, 1998, p. 7)

Citations in journals

Computing scientometric indicators

Vinkler (2000) summarizes the applicability of a range of metrics, varying in degree of sophistication, for evaluating the performance of research teams, differentiating "gross" indicators (e.g., raw counts of citations received) from "specific" indicators (e.g., number of citations per paper or per researcher), "distribution" indicators (e.g., proportion of total citations received by all research teams being compared), and "relative" indicators such as Vinkler's Relative Citation Rate (RCR), the number of citations received divided by the sum of the impact factors of the journals where the cited papers were published. This last metric is an example of a measure that compares counts of observed citations with estimates of some "expected" citation score, and is similar to the categorical journal impact used by ISI in their "macro" journal studies (see Garfield, Eugene, Journal Citation Studies. 20. Agriculture Journals and the Agricultural Literature, in Current Contents, 20, 5-11 (1975), http://www.garfield.library.upenn.edu/essays/v2p272y1974-76.pdf for an early example of such a study). (Borgman, C & Furner, J, Scholarly Communication and Bibliometrics, in Cronin, Blaise (ed.), Annual Review of Information Science and Technology, Information Today, Inc., Medford, NJ, vol. 36, 3-72, 2002, http://polaris.gseis.ucla.edu/jfurner/arist02.pdf)
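
The four families of indicators described above can be illustrated with a short sketch. The data and function names are hypothetical, and the RCR function simply follows the definition quoted in the passage (citations received divided by the sum of journal impact factors); it is not taken from Vinkler's own text, which may define further details.

```python
# Each paper is a pair: (citations received, impact factor of its journal).
# Both teams below are invented sample data.
team_a = [(12, 2.0), (3, 1.5), (0, 0.5)]
team_b = [(5, 1.0), (0, 4.0)]


def gross_citations(papers):
    """'Gross' indicator: raw count of citations received."""
    return sum(cites for cites, _ in papers)


def citations_per_paper(papers):
    """'Specific' indicator: citations per paper."""
    return gross_citations(papers) / len(papers)


def citation_share(papers, all_teams):
    """'Distribution' indicator: share of all citations in the comparison set."""
    total = sum(gross_citations(team) for team in all_teams)
    return gross_citations(papers) / total


def relative_citation_rate(papers):
    """'Relative' indicator: observed citations over the sum of impact factors."""
    expected = sum(impact for _, impact in papers)
    if expected == 0:
        raise ValueError("sum of impact factors is zero; RCR is undefined")
    return gross_citations(papers) / expected


for name, team in (("A", team_a), ("B", team_b)):
    print(
        name,
        gross_citations(team),
        citations_per_paper(team),
        citation_share(team, [team_a, team_b]),
        relative_citation_rate(team),
    )
```

With this sample, team B has an RCR of 1.0 (its papers are cited exactly as often as the impact factors of their journals would suggest), while team A is above that expected level.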

Self-citations

Self-citation raises similar questions and exists at all levels: the laboratory, the institution, and the nation. Neutralization at one level, mentioned by Van Raan, is a partial solution. Self-citation seems a more serious issue in performance rating applications than in structuring and mapping applications. (Zitt, Michel, Facing Diversity of Science: A Challenge for Bibliometric Indicators, in 3 Measurement, 1, 38-49 (2005), http://www.obs-ost.fr/doc_attach/FacingDiversityOfScience.pdf)
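
One way to see why neutralization at a single level is only a partial solution is to count citations after excluding self-citations at each level in turn. The sketch below is an assumption-laden illustration: the record layout, the sample data, and the rule "a citation is a self-citation at a level when the citing and cited units are identical at that level" are all invented for the example.

```python
# Hypothetical citation records: affiliation of the citing and the cited paper
# at three levels of aggregation.
citations = [
    {"citing": {"lab": "A1", "institution": "A", "nation": "FR"},
     "cited": {"lab": "A2", "institution": "A", "nation": "FR"}},
    {"citing": {"lab": "A1", "institution": "A", "nation": "FR"},
     "cited": {"lab": "A1", "institution": "A", "nation": "FR"}},
    {"citing": {"lab": "B1", "institution": "B", "nation": "FR"},
     "cited": {"lab": "A1", "institution": "A", "nation": "FR"}},
    {"citing": {"lab": "C1", "institution": "C", "nation": "US"},
     "cited": {"lab": "A1", "institution": "A", "nation": "FR"}},
]


def external_citation_count(records, level):
    """Count citations whose citing and cited units differ at the given level."""
    return sum(
        1 for record in records
        if record["citing"][level] != record["cited"][level]
    )


for level in ("lab", "institution", "nation"):
    print(level, external_citation_count(citations, level))
```

Of the four sample citations, three survive neutralization at the laboratory level, two at the institution level, and one at the national level, so the result depends on the level chosen.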

Sociology of citation: general theory, motivations, and behavior

Echoing Cronin (1984), calls for a "theory of citing" have long been a regular feature of the bibliometric literature. In a discussion paper published in Scientometrics along with invited responses from such as Cronin (1998), Egghe (1998), and Kostoff (1998), Leydesdorff (1998) re-articulates the plea ("Citation analysis calls for a theory of what is being analyzed; citation analysts consequently tend to be in need of theoretical legitimation" (p. 5)), and supplies a major contribution to the debate about the possibility and nature of a theory of this kind. Leydesdorff distinguishes between at least two things to be explained in any theory of citing: the citation per se, and citation analysis as an area of study. He sketches the histories both of citation practice (identifying shifts over time in the function and role of citations), and of citation analysis, positioning the latter in the framework provided by the interdisciplinary field of science and technology studies. He paints a rich portrait of the inherent complexity of citation practice, arguing that citation networks are dual-layered (the result of interaction between first-order, social networks of authors and second-order networks of "communications" or texts). He uses this distinction to demonstrate that any individual cited-citing pair may be viewed as an author-author, text-text, author-text, or text-author relation, as well as either at a disaggregated (micro-) level or at various (macro-) levels of aggregation, and suggests a two-facet taxonomy of the functions of citations on this basis. He further concludes that social and cognitive perspectives on citation practice are equally necessary; that there thus exists a multiplicity of theories of citation; and that it remains "uncertain" whether a meta-theory reconciling the insights, for example, of qualitative and quantitative studies is attainable. (Borgman, C & Furner, J, Scholarly Communication and Bibliometrics, op. cit.)

At the most general level, much current intellectual development in citation studies is related to a tendency for research designers simply to take more seriously the notions that citer behavior, like relevance judges' behavior in general (Schamber, Eisenberg, & Nilan, 1990), is (a) individual and subjective, in that different people, even when placed in otherwise similar situations and taking into account similar factors, will make different decisions; (b) complex and multidimensional, in that single decisions are often based on multiple factors, and multiple kinds of factors, simultaneously; and (c) dynamic and situational, in that, on different occasions or when placed in different situations, people take account of different factors and make different decisions.
Studies of relevance judges' behavior — i.e., studies of those decisions and actions of information seekers that are based on their judgments as to whether or not particular documents are relevant to them in particular situations — are core to the sub-field of library and information science (LIS) that is devoted to understanding information-related behavior. Furthermore, the perception that there is an important analogy to be drawn between linking behavior and the making of relevance judgments has been expressed with increasing frequency. Harter (1992) puts it as follows: "An author who includes particular citations in his list of references is announcing to readers the historical relevance of these citations to the research; at some point in the research or writing process the author found each reference relevant. Relevance is the idea that connects IR to bibliometrics, and understanding in one context should aid our understanding of it in the other." Studies of linking behavior may thus be explicitly positioned not simply as contributions to the general literature of information-related behavior, but specifically as close relatives of impressive recent work that has led to an improved understanding of the criteria used by information seekers when judging relevance. (Borgman, C & Furner, J, Scholarly Communication and Bibliometrics, op. cit.)

Reports of notable studies in which researchers have sought to elicit citers' opinions about their own citing activity have appeared in three recent articles. (…) Their [Shadish et al. (1995)] results were that a highly cited work is more likely than a less-cited work to be the following:

perceived as an "exemplar", i.e., as a classic reference in a field, as a "concept marker", as a representative of a particular genre, as one of the earliest works in a field, as authored by a recognized authority, as generative of much novel work, or as especially resistant to falsification;
old;
perceived as "high quality"; and
perceived as a source of a method or a design feature.

Most significantly of all, however, a highly cited work is less likely than a less-cited work to be perceived as "creative". Shadish et al. posit the existence of high quality but poorly cited articles "that are creative in a way that does not fit into existing conceptual frameworks or into accepted social norms for scholarship in an area".
Shadish et al. were led from their findings to conclude that, although citation counts are correlated with perceptions of quality, quality is not the only factor that has an impact on citation counts, and other such factors are themselves not correlated with quality. (Borgman, C & Furner, J, Scholarly Communication and Bibliometrics, op. cit.)

(…) White and Wang (1997) concluded (p. 147) that "citing behavior is complex, multidimensional behavior" and summarized their findings roughly as follows. Firstly, the "topicality" and "content" of the cited document were the most commonly used criteria on which citation decisions were based, although numerous other criteria were used on multiple occasions. Secondly, the choice of criteria in a particular instance seemed to depend on the "frame of reference" or purpose prioritized by the citer at that instance (e.g., execution of the research project or of the immediate task, augmentation of the field, satisfaction of external judges, etc.). Thirdly, some "metalevel" beliefs influence citation decisions independently of considerations of the ways in which individual documents can be used: these include beliefs about the value (even the morality) of self-citing, of copy-citing (copying citations found in other citers' papers), of citing secondary sources, of citing articles from peripheral journals, and of citing to meet external judges' expectations. White and Wang suggest that it might be possible, on this basis, to identify particular styles or codes of citing, and that certain styles may be characteristic not just of individual citers but of disciplines. (Borgman, C & Furner, J, Scholarly Communication and Bibliometrics, op. cit.)