Autism Causes Bibliometric Research

Chapter 3

This chapter describes the design of the study and discusses methods for carrying out this design. The researcher explains how the sample was selected and the procedures used to characterize the selected data. The chapter closes by acknowledging limitations to the sampling and analysis and suggesting how these limitations might be overcome.

Design of the Study

Selection of the Sample

The goal is to map all of the autism causation literature as it has evolved since its origin in the 1940s, specifically from 1943 when the disability was first noted (separately by Kanner and by Asperger) through etiology research published by December 1996 and present in bibliographies published before July 1997. The emphasis is on abstracts of journal articles. The three main sources used will be MEDLINE, EMBASE, and BIOSIS Previews, along with their respective print counterparts Index Medicus, Excerpta Medica, and Biological Abstracts, which are used especially for coverage of the early years of autism but also for source comparison research. These main sources will be supplemented with the National Library of Medicine’s Current Catalog, PsycLIT, and PsycINFO, along with the print version for these two, the American Psychological Association’s Psychological Abstracts, again for the critical early years of autism causation research.

In her Autism Treatment Guide, Gerlach credits the 1964 publication of Infantile Autism: The Syndrome and its Implications for a Neural Theory of Behavior with debunking “the assumption that autism was a result of ‘bad’ parenting.” Therefore, a considerable amount of the literature in the first twenty plus years of autism is in the field of psychology. Although the primary three sources often abstract psychology-related journals, specifically adding psychology abstracts to the search should provide a fairer guarantee of pulling in causal theories from these early years. Even now, autism is behaviorally diagnosed, not by blood tests or brain scans. Therefore, the investigator intends to search widely by concentrating on medical, biological, and psychological databases and bibliographies, and using additional sources—such as the Internet, conference proceedings, and monographs—mainly for confirmation and supplemental information. For example, two sources found not on but through the Internet will be used: “A Collection of Abstracts on Genetic Research of Autism” (no longer on the Web; formerly at http://www.udel.edu/bkirby/asperger/genetic_abstracts.html) and The Autism Research Database’s “Titles in Autism” (published by the Autism Research Unit at the University of Sunderland aru@sunderland.ac.uk).

Procedures

Data Collection

Two data filters are introduced with the selection of the bibliographies for this bibliometric mapping. The first is the researcher-controlled decision of which files to use. The other is the inclusivity of the files themselves, though the four digital bibliographies could be called “exclusive” only in the subjective sense. With coverage back to 1974, EMBASE indexes “over 3,500 biomedical and pharmacological journals published throughout the world.” Since 1966, MEDLINE has been the digital counterpart not only to Index Medicus but also to the Index to Dental Literature and the International Nursing Index. It “covers virtually every subject in the broad field of biomedicine, indexing articles from over 3,700 international journals” published in seventy-one countries. BIOSIS Previews began in 1969 and indexes over half a million research accounts yearly from “nearly 7,600 primary journal and monograph titles, . . .notes, letters, . . .government reports, and research communications.” The 7,600 is updated to “nearly 9,000” in the DIALOG Bluesheets on the World Wide Web. The site (http://www.dialog.com/plweb-cgi/idoc.pl?350+unix+_free_user_+web6.dnt.dialog.com..80+dialog+dialog+bluen+bluen++bluesheets#PD) had the July 18, 1997 Bluesheets loaded when the researcher visited. The American Psychological Association’s site (http://www.apa.org/psychnet/) states that PsycINFO indexes “over 1,350 scholarly journals.” PsycINFO covers from 1967 to the present and pulls from the international literature in such fields as “behavioral and social sciences, . . .developmental psychology, educational psychology, experimental human. . .psychology, personality, physical and psychological disorders.” Clearly, once the decision to use a DIALOG file is made, limits on the data sample due to indexing exclusivity are a mild concern, but not one the researcher can control.

Bibliographies are the sources used most frequently for bibliometric studies because of their low cost and comprehensiveness. However, because they do not give enough information to stand alone and may not react quickly enough to changes in the subject literature, they provide “no more than a first step toward. . .a literature description.” Abstracting and indexing services have improved the statistical scope of bibliometric studies. These index down to the article or review level, rather than at the broader journal or book level, which may or may not be useful, depending on the search. However, a clear drawback of the indexing vendors is that completeness is not guaranteed. Despite the improved indexing mentioned above, these services are selective, tending toward the “scholarly” journals, and do not abstract even their core journals from cover to cover. They do, however, make “simple” statistics available and are more “user-oriented” (if not user-friendly) than other secondary services. Nicholas and Ritchie offer ways to assess coverage patterns, currency, and overlap. They suggest learning, before using a service, what its coverage policy is as to subjects, titles, languages, countries, document forms, and frequency. While discussing the “currency” issue, Nicholas and Ritchie urge care in distinguishing the actual from the “intended” date of publication and state in general terms that journal articles appear quickly, reports and proceedings take a little longer to index, and language translations may take years. The authors also state the value of checking more than one vendor, even when the subject range appears the same, because the duplication from one service to the next is seldom more than 75 percent.

A search for autism causation literature listed in sources published before January 1997 (and indexed in the bibliographies by June 1997) governs the data collection phase. Since the cut-off date for the publication of the bibliographies is the first half of 1997, nearly all of the material researched will have been published in 1995 and before, with the online databases possibly abstracting some of the early 1996 material, but not the full year. Query construction will use the “autism” root and “pervasive developmental disorder” as well (autism’s “parent” category in the DSM-IV). Linked to this set will be the “cause” and “etiology” root words. (A copy of the actual OneSearch query from DIALOG is Appendix A.) Since “cause” has many meanings and uses, it could generate many false drops. Rather than construct the query in an attempt to guard against this, the false drops will be filtered during reading of the abstracts to determine whether the sources they represent are truly about autism causation. If the article discusses only an autism treatment or some test of autistic ability without taking a specific stand on the issue of how autism happens, it will be treated as a false drop.
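The query logic described above can be sketched outside DIALOG as a simple text filter. This is only an illustration: the regular expressions and the function name `is_candidate` are assumptions standing in for the truncated roots, and the actual OneSearch query remains the one in Appendix A.

```python
import re

# Hypothetical stand-ins for the truncated roots used in the search.
# The real DIALOG query is in Appendix A and is not reproduced here.
SUBJECT = re.compile(
    r"\bautis\w*|\bpervasive\s+developmental\s+disorder", re.IGNORECASE
)
CAUSE = re.compile(r"\b(?:caus|etiolog)\w*", re.IGNORECASE)


def is_candidate(text: str) -> bool:
    """Return True when the text has both a subject term and a cause term."""
    return bool(SUBJECT.search(text)) and bool(CAUSE.search(text))


# Example calls on made-up titles (not records from the study).
for title in (
    "Etiology of infantile autism",
    "Treatment of autistic children",
    "The autistic child caused a stir",
):
    print(title, is_candidate(title))
```

The third made-up title passes the filter although it says nothing about etiology, which is the kind of false drop that has to be removed by reading the abstract.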

The study does not limit itself to autism causation research within the realm of the four possibilities mentioned. As other causal categories present themselves, additional values can be added to incorporate them. For this study, the operational definition of “autism causation literature” is any document for which the abstract or the source itself uses both “autism” (its root or its “parent” characterization, “pervasive developmental disorder”) and a “cause”/“etiology” derivative which clearly assigns a cause to the defect. For example, the easiest source abstracts to characterize contain comments such as “this research proposes a genetic model for autistic development;” “autism is the result of a CNS [central nervous system] dysfunction, not a hearing deficit;” and “it is clear that autism is a neurologically-based developmental disorder.” Others that are also clearly stated, but require some deduction, include “although autism is associated with a wide variety of genetic conditions, it is a behaviorally defined phenotype arising from CNS damage” [neurological, not genetic] and “there was an extremely high frequency of autism (11.4%) in a sample of 70 children of cocaine mothers” [environmental].

The researcher expected more than half of the sources reviewed to be false drops. Three reasons for this include: (1) “cause” can have a meaning more general than “etiology,” (2) “autistic-like” can be used in an abstract which otherwise has nothing to do with autism, and (3) articles may be about autism, but not about autism causes. But the main reason for so many false drops is the researcher’s attempt to identify the items first hand and not rely solely on the Boolean logic of query language. All of the synonyms for “autism” were developed by the researcher. (Fortunately, most of these have “autis” somewhere in their name, like autistic spectrum disorder and infantile autism.)

Nesting (that is, parenthetically pairing off terms to control the order in which they are searched) was used, but there was no NOT-ing out “treatment” or other filtering out of undesired words. Preliminary experimentation with NOT-ing out terms indicated that too much was removed. To test this preliminary finding (and the assumption that more is published about autism treatments than is printed about autism causes), the researcher conducted a MEDLINE search using “autis? and treatment?” and another for “autis? and caus?.” The former returned 551 citations; the latter, 135. NOT-ing out “treatments” from the “causes” brought this 135 down to 117 hits. (NOT-ing “causes” out of the “treatments” search similarly reduced the 551 to 533 and hints at further complications to the search for autism causation literature.) The commonality of the words “treatment” and “cause” was the main reason for the researcher adding the more scientifically precise “etiology” and also his reason for filtering out the false drops manually.
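The four counts reported above can be checked against each other with simple set arithmetic. The sketch below assumes each NOT search behaves as a set difference over the same MEDLINE file; the variable names are illustrative only.

```python
# Counts reported in the text for the MEDLINE test searches.
treatment_hits = 551        # "autis? and treatment?"
cause_hits = 135            # "autis? and caus?"
cause_not_treatment = 117   # cause set with treatment NOT-ed out
treatment_not_cause = 533   # treatment set with cause NOT-ed out

# Citations matching both terms, computed from each side.
overlap_from_cause = cause_hits - cause_not_treatment
overlap_from_treatment = treatment_hits - treatment_not_cause

# Share of the cause set that would be lost by NOT-ing out treatment.
share_of_cause_set_lost = overlap_from_cause / cause_hits

print(overlap_from_cause, overlap_from_treatment)
print(round(share_of_cause_set_lost * 100, 1))
```

Both subtractions give the same overlap of 18 citations, so the reported figures are mutually consistent; removing them would discard about 13 percent of the cause set.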

How to harvest and filter the information contained in each file was a concern. Searching the DIALOG files individually, using SearchSave, or using OneSearch seemed the obvious options for this autism causation search. The researcher opted for OneSearch because the query was not so complex that it needed saving. The main appeal of DIALOG’s OneSearch option, however, is that it allows for removing duplicate abstracts (the “rd” command).
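How DIALOG's “rd” command identifies duplicates is not described here, so the following is only a sketch of the general idea of cross-file duplicate removal. The matching key (normalized title plus year), the field names, and the sample records are all assumptions made for illustration.

```python
import re


def dedup_key(record: dict) -> tuple:
    """Hypothetical matching key: lowercased, punctuation-free title plus year."""
    title = re.sub(r"[^a-z0-9]+", " ", record["title"].lower()).strip()
    return (title, record["year"])


def remove_duplicates(records: list) -> list:
    """Keep the first record seen for each key, preserving input order."""
    seen = set()
    kept = []
    for record in records:
        key = dedup_key(record)
        if key not in seen:
            seen.add(key)
            kept.append(record)
    return kept


# Made-up records: the same article retrieved from two files.
sample = [
    {"title": "An Example Title.", "year": 1990, "abstract_source": "MEDLINE"},
    {"title": "An example title", "year": 1990, "abstract_source": "EMBASE"},
    {"title": "Another example", "year": 1991, "abstract_source": "BIOSIS Previews"},
]
unique = remove_duplicates(sample)
print(len(unique))
```

Keeping the first occurrence means the order in which files are searched decides which vendor is recorded as the abstract source for a duplicated item.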

The next issue for resolution was how to track data collection and enable the sorting necessary to aid in its analysis. To manage this, the researcher set up a simple database with the following categories:

(1) record number,

(2) suggested cause/number giving this cause [this number served as a running total],

(3) author(s),

(4) title of article,

(5) country of publication,

(6) year of publication,

(7) format,

(8) abstract source, and

(9) article source.

The first category, record number, was created mainly to provide a running tally of sources used. However, this field distinguished itself from the second for the many articles that postulate multiple etiologies. That is, the degree to which the total number of etiologies exceeds the number of records used is accounted for by the double and triple causation sources. There are, then, 1,543 causes attributed in 1,259 database records.
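The nine categories and the difference between counting records and counting causes can be sketched as a small data structure. The class name, field names, and the three sample records are hypothetical; only the totals of 1,543 and 1,259 come from the text.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List


@dataclass
class SourceRecord:
    """One database record; a record may attribute more than one cause."""
    record_number: int
    causes: List[str]
    authors: str
    title: str
    country: str
    year: int
    format: str
    abstract_source: str
    article_source: str


# Made-up placeholder records, not data from the study.
records = [
    SourceRecord(1, ["genetic"], "A", "Example 1", "USA", 1990,
                 "journal article", "MEDLINE", "Journal X"),
    SourceRecord(2, ["neurological", "environmental"], "B", "Example 2", "UK",
                 1991, "journal article", "EMBASE", "Journal Y"),
    SourceRecord(3, ["psychological"], "C", "Example 3", "USA", 1965,
                 "book", "Psychological Abstracts", "Publisher Z"),
]

# Running tally per cause (the role of the second category).
cause_tally = Counter(cause for r in records for cause in r.causes)
total_attributions = sum(cause_tally.values())

# Totals reported in the text: attributions beyond one per record.
REPORTED_CAUSES, REPORTED_RECORDS = 1543, 1259
REPORTED_SURPLUS = REPORTED_CAUSES - REPORTED_RECORDS

print(len(records), total_attributions, REPORTED_SURPLUS)
```

On the reported totals, the 284 attributions beyond one per record are what the double and triple causation sources contribute.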

Articles that deal with more than one causal hypothesis were characterized in the following manner: if they merely stated what causes are possible, they were not counted in the research, but treated as false drops. If an article gave equal weight to two or more distinct causal hypotheses, it was counted for two or more. If the source leaned in favor of a certain causation, it was considered under this cause only. Obviously, for sources like this, the source itself had to be evaluated, not only its abstract. For a clear majority of articles, however, the researcher assumed that the abstracts accurately reflected the work under analysis for representation. This presented two other problems to work through: (1) deciding the causation when the abstract was not clearly written and (2) deciding what to do about articles whose abstracts only negate a cause without supporting another. The first kind of lack of clarity was not an issue. When an abstract addresses a specific etiology, it is generally to describe that etiology as the main cause of autism.
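The three counting rules for multi-cause sources can be written down as a small function. This is a sketch of the rules as stated, not the researcher's actual tool; the `Stance` labels and the function name are assumptions, and the judgment of which stance a source takes still has to be made by reading it.

```python
from enum import Enum
from typing import List, Optional


class Stance(Enum):
    LISTS_POSSIBILITIES = "merely lists possible causes"
    EQUAL_WEIGHT = "gives equal weight to two or more causes"
    FAVORS_ONE = "leans toward one cause"


def counted_causes(stance: Stance, causes: List[str],
                   favored: Optional[str] = None) -> List[str]:
    """Return the causes a source is counted under; empty means false drop."""
    if stance is Stance.LISTS_POSSIBILITIES:
        return []
    if stance is Stance.EQUAL_WEIGHT:
        return list(causes)
    if favored not in causes:
        raise ValueError("favored cause must be one of the causes discussed")
    return [favored]


print(counted_causes(Stance.EQUAL_WEIGHT, ["genetic", "environmental"]))
print(counted_causes(Stance.FAVORS_ONE, ["genetic", "environmental"], "genetic"))
print(counted_causes(Stance.LISTS_POSSIBILITIES, ["genetic", "neurological"]))
```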

The bigger problem was abstracts that state only that autism is not caused by genetic or psychological (or other) factors, without saying what the etiology might be instead. It sounds contradictory, but the researcher opted to place these abstracts in the very category from which the authors were removing them as autism cause possibilities. The justification is that the research in each case was investigating the possibility. That the results cleared the category as a possible autism cause did not change the direction of and reason for each author’s research investigation. In any case, all but 14 of the sources that negated a cause offered another either directly or by clear implication. Fourteen is a little more than one percent of the 1,259 abstracts in the sample, so it is unlikely that the issue was large enough to skew the results, or even change them.

The database developed documents where the article was published, not necessarily where the research was done or even the authors’ native countries. The format indicates whether the source is a journal article, editorial, book, conference proceeding, etc. The abstract source is the online vendor or bibliography in which the item is found, while the article source is the journal (or other source, i.e., the publisher for monographs) in which the item was published.

Another advantage to the use of this project-specific database was that it provided a fourth filter against duplicate entries in addition to (1) OneSearch, (2) the researcher counting the causes as they are characterized, and (3) the cause tally following each source or DIALOG file review. Although the intent of entering sources into this database was for easier manipulation, an immediate benefit was that it provided yet another chance to characterize an article or to filter out sources as duplicates or false drops.

Data Analysis

The intent of this research is to enumerate and map autism causation categories using graphs similar to those used for Bradford’s Law of Scattering. However, while Bradford’s Law tracks how the literature on a specific subject is spread over a number of journals, what this investigation seeks to map are the numbers—by cause specifically, but by year, too—for autism causation identified in the research over fifty-three years. The research hypothesis being investigated is that no single cause will dominate the literature (that is, be significantly greater than equal in number to the sum of articles about any other autism etiology). Certainly, no one cause should claim a majority of the abstracts used. What the researcher expects to find, therefore, is a relatively equal distribution of autism causation literature among a handful of possibilities. Preliminary reviews indicated that neurological, genetic, environmental, and psychological causations are the main four under investigation by autism researchers, so the numbers of articles for each of these areas should be nearly equal if no single causation dominates.
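The simplest reading of the hypothesis, that no one cause claims a majority of the attributions, can be checked mechanically once the tallies exist. The sketch below uses invented counts purely to show the check; the function names are assumptions, and a test of whether the shares are “relatively equal” would need a further criterion not specified here.

```python
from typing import Dict


def cause_shares(counts: Dict[str, int]) -> Dict[str, float]:
    """Fraction of all attributions falling under each cause."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {cause: n / total for cause, n in counts.items()}


def has_majority_cause(counts: Dict[str, int]) -> bool:
    """True when one cause exceeds the sum of all the others."""
    total = sum(counts.values())
    return any(n > total - n for n in counts.values())


# Invented counts for illustration only, not results from the study.
example_counts = {
    "neurological": 40,
    "genetic": 30,
    "environmental": 20,
    "psychological": 10,
}
print(cause_shares(example_counts))
print(has_majority_cause(example_counts))
```

Exceeding the sum of all other causes is the same condition as holding more than half of the total, so a single comparison covers both phrasings.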

A related analysis is tracking the overall growth of causal areas and any shifts in the direction of the research as evidenced by journal name changes or tandem year spans that indicate a jump from one specific autism cause to another (which can be a subtle but cogent statement of a shift in thinking). The utility of print versus online databases for this kind of research is also investigated. For example, the researcher makes note of print pointers to autism causation literature, such as those which make “autism” its own heading in cumulative volumes or those which include “autism” under an “etiologies” index (or vice versa). Discussion of Goffman’s epidemic theory is also incorporated.

Limitations of the Study

Sampling Limitations and Their Solutions

Being able to judge the causation category for the abstracts found presents issues more thorny than actually finding them. Two sampling issues to solve before collecting the data were: (1) overlap of the four categories and (2) incorporating other than English-language autism causation research. These issues are addressed in this section. In the interest of clarity and realization of time constraints, the profiles of the research literature and reporting of results were limited to cause suggested, format, author, country of publication, year of publication, article source, and abstract source. Limiting the search to causes—rather than treatments or “cures”—further controlled the sample size.

Overlap of categories was the most unclear issue. Since autism is by definition a brain disorder, even genetic and environmental causations manifest themselves as brain defects. All causes seem to overlap at the neurobiological. This could have been a hair-splitting exercise at best, or (at worst) a judgment call this researcher does not have the qualifications to make. The research analysis avoided this, however, by concentrating not on the nature of autism but on the characterizations offered by the literature about autism causation research. For example, in the abstract mentioned under “Data Collection” that describes the 11.4 percent incidence of autism among the children of 70 cocaine-addicted mothers, the autism etiology was characterized as “environmental” because the perinatal exposure to cocaine was the emphasis of the abstract. Had the authors concentrated on the brain damage involved, the abstract would have been characterized as “neurological.” If the authors had developed both aspects, the abstract would have been classified as both “environmental” and “neurological.”

Since the researcher has no effective secondary languages, this investigation attempts international inclusivity by using foreign articles with abstracts translated into English in the bibliographies already mentioned, mainly those published in the United States of America. There are specific English language versions of African Index Medicus, Index Medicus Israeliticus, Index Medicus for WHO [World Health Organization] Eastern Mediterranean Region, Index Medicus for WHO South-East Asia Region, a South Korean Index Medicus, and an Index Medicus Latino-Americano. However, these are better used for follow-on, expanded research. These are available at the National Library of Medicine, in Bethesda, Maryland, in person (or by the volume through interlibrary loan). Also, since they are not easily accessible, avoiding the use of English language bibliographies published outside of the United States will more readily facilitate replication of this investigation.

Methodological Limitations and Their Corrections

Limits to the research methods employed included: the uncertainty of “equal” weighting, the issue of validity, and personal “interference.” Deciding what constitutes a single bibliometric unit was one issue to solve. Any document expressing and supporting a view on autism’s cause appears to be the baseline. However, should a book carry no more weight than an article or, even more important, an abstract of an article? Or if the book presents and defends various causes, should it count multiple times in the research? The short answer is “yes,” it should. Each cause presented and/or researched has equal weight with all others. While this thesis is marginally interested in distinguishing among the sources, it is primarily concerned with counting causes, not sources.

Assuming the data collection is straightforward and the research identifies the four causal groups expected (neurobiological, psychological, genetic, and environmental) and that their numbers are relatively easy to tally, will these numbers measure precisely what the research intended? Would the fact that a large percentage of articles point to a genetic cause, for example, mean that this cause of autism is more likely to be the “true” cause? While this outcome would be desirable, it does not account for factors such as the politics of research funding. The bibliometrics may look good on paper, but the research itself would not reveal the politics behind the numbers, making it difficult to draw these kinds of conclusions. However, these issues are beyond the scope of information sciences, and the research being described here intends to draw bibliometric conclusions only.

The last of the methodological limitations is a personal one. The researcher’s two children are autistic, a fact which may have biased him if not in favor of a genetic causation, then at least away from theories his everyday experience has not supported. However, mitigating this bias is the researcher’s years of reading what has been written about autism and his personal experience with many individuals with autism, through which he does realize that given the variety of autism’s manifestations, anything is possible. That this is a statistical, bibliometric investigation should further mitigate this emotional bias.

Although the issue of what is actually being measured has been set aside as being beyond the scope of the work undertaken, there are methods to equalize the differences in the amount of information given from one source to the next and to guard against researcher bias. The sources that provide a 100 to 250 word abstract make it easiest to gauge the causes presented. However, there are other cues to use when examining a source or citation. The title of the work, keywords accompanying the citation (useful when the vendor gives little or no abstract), and the title or known focus of the journal which published the article yield reliable measurements. When the researcher could use the combination of this source-surrounding information to characterize the autism causation being presented, the source was added to the data. If not, it was considered a false drop. A danger in this is that sometimes the accompanying information is misleading, especially titles alone. However, these characterization-supporting factors were not considered in isolation, only in combination with each other, so their potentially misleading information did not influence the majority of the data (that is, the clearly stated autism etiologies). For example, the journal article “Development of Object Relations During the First Year of Life,” by K.S. Robson had no accompanying abstract. However, it appeared in Seminal Psychiatry (in November, 1992, volume 4, pages 301-16) and the MEDLINE descriptors used include “child development. . .personality disorder—etiology. . .autism, infantile—etiology. . .avoidance learning. . .[and] maternal deprivation.” These cues (journal, title, and descriptors taken together) account for the researcher’s “psychological” characterization of the article. Another example is “Congenital Rubella and Autistic Behavior” by C.N. Swisher and L. Swisher. The indexing information for this article appeared in EMBASE without an abstract. That the source is the New England Journal of Medicine (1975, volume 198, pages 293-4) is not a useful characterization tool, since this journal publishes a wide variety of medical articles. However, the subject headings (Emtags) include “infectious diseases. . .etiology. . .virus infection. . . congenital infection. . .[and] infantile autism.” The identifier is “autistic behaviour; relationship with congenital rubella.” The section headings include “virology,” “epidemiology,” and “rubella virus.” These pieces of information taken together are enough to justify an “environmental” characterization. If it were known that Robson always wrote about psychological matters or that Swisher and Swisher’s work is often associated with congenital defects, these would have aided in the characterization as well.

Three other biases had to be guarded against vigilantly because they could have had a detrimental effect on the numbers. The researcher is predisposed to neurological causes, is biased against psychological causes, and tends to favor works from countries other than the United States. When the cause seemed neurological, or neurological paired with another cause, the researcher read these abstracts more than once to ensure they were indeed presenting causal research and not merely supplying general neurological information. Guarding against the psychological bias required almost the opposite tactic. Especially for the early years of autism research, even if it seemed from the abstract that only general psychological information was being relayed, the researcher read twice to determine if the abstract could be labeled “psychological” as a legitimate presentation of an autism cause. The bias for foreign country inclusion as a subtle way of forcing the issue that autism is an international problem was easier to guard against. The researcher simply began to make the characterization before looking at where the work had been published.

Methods Summary

This study design chapter includes research methods used, describes how the sample was selected and where it was selected from, and lays out the procedures used for data analysis. Several sampling and methodological limitations to the work at hand are described, along with examples and plans to work around or overcome these obstacles.

The next chapter applies the methods given above to the data collected and describes the results obtained.


This page is maintained by Jeff and Cathy Romanczuk, lukate@charter.net
