Information Handling in Astronomy

(André HECK, Strasbourg Astronomical Observatory)

In a science such as astronomy, information handling encompasses data collection, analysis and dissemination, as well as the way astronomers publish, interact and communicate, including with other communities, with amateur astronomers and with the public at large.

The main aim of astronomers is to contribute to a better understanding of the universe (as well as of its past and future) and consequently to a better comprehension of the place and role of humans in it.

To this end, together with theoretical investigations, they carry out observations to obtain data that subsequently undergo treatment and studies leading to the publication of results. The whole procedure can include several iterations between the various steps as well as with external fields, non-scientific disciplines, instrumental technologies and information handling methodologies.

In the following, the concept of information will cover the observational material, the more or less reduced data extracted from it, the scientific results and the accessory material used by scientists in their work (bibliographical resources, yellow-page services, and so on), as well as the communications and publications of all kinds.

It should also be noted that, contrary to other scientific disciplines, astronomy has the peculiarity of not being able to interact with the objects it investigates (except for very few bodies of the solar system where in situ studies can be carried out by spacecraft). Consequently all our knowledge of the universe has so far depended almost exclusively on the photons reaching us from outer space.

Astronomy has thus to rely on the ingenuity of instrumentalists to conceive and design a whole range of observing tools exploiting at best the latest technologies and the most sensitive detectors to obtain the most relevant and most varied information allowing progress of astronomical knowledge. The current trend is also towards panchromatic astronomy, i.e. combining information from the various wavelength ranges of the electromagnetic spectrum (radio, infrared, visible, ultraviolet, X, Gamma, ...), instead of restricting oneself to specific ranges as used to be the case in the past.

Compared with the past too, ever larger amounts of data are being collected, and the rate will probably continue to accelerate. Instruments such as NASA/ESA’s Hubble Space Telescope (HST) or ESO’s Very Large Telescope (VLT) are generating or will generate annually a volume of information of the order of terabytes of data. The rate of increase of observations is also matched by their diversification.

Until not so long ago, an astronomer could also work individually from the conception of a project through to the collection and analysis of data. Nowadays, as instrumentation has become more complex, teams of researchers, most often international ones, have become necessary and they quite naturally include technologists or instrumentalists.

At the other end of the chain, the teams are more and more including methodologists, i.e. specialists of information handling. If image processing is a natural consequence of a sophisticated technology, information handling rather qualifies whatever happens to already well-reduced data.

Astronomy and related space sciences have always been at the leading edge of new information technologies, often testing, contributing to and pushing for new developments, be it for the access to databases, the usage of networks or the initial explosion of world-wide web (WWW) sites, as well as for remote observing (spacecraft and ground-based telescopes remotely operated).

The recent dramatic evolution of communication and information technologies had a deep impact on the way the astronomy community interacts and works, in other words on its dynamics, and one can certainly expect more evolution in the future as the corresponding technologies will bring in new potentialities.

Information handling in astronomy thus reflects the way astronomers work and ensures progress of the astronomical knowledge which is then shared with colleague scientists as well as, on a less specialized level, with the significant community of amateur astronomers round the world (also a phenomenon proper to astronomy) and with the public at large since cosmic perceptions have always been a fundamental component of human culture and philosophy. Astronomers are of course also deeply involved in education at all levels.

As described in the following and as illustrated by figure 1, the information flow in astronomy is far from being a simple linear one. More and more of the processes and corresponding exchanges are performed electronically.

Figure 1
A schematic view of the information flow in astronomy.


Observing and collecting data

Carrying out professional astronomical observations implies having access to ground-based and space-borne instruments which is a highly, sometimes fiercely, competitive process. Candidates have to submit proposals requesting instrument or spacecraft time with specifications for secondary instrumentation, observing modes and configurations.

The cases must be made not only on the basis of the technical capabilities requested but also with strong references to the scientific achievements aimed at in the light, not only of what has already been achieved, but also of the expertise of the proposers themselves.

Expert committees review the technical feasibility of the proposals and the relevance of the scientific cases. They refer to the available literature (a reason for publishing, see below) and to the past observations carried out with the instrument requested in order to avoid duplication of activities of sometimes highly oversubscribed equipment (hence the need for observing logs, see below).

Oversubscribing factors vary with specific instruments or spacecraft (the new ones and the most efficient ones being generally more sought for), but it is not rare that between five and ten times more observing time is requested than is available. The task of the time allocating committees is therefore not an easy one.

Often astronomers, or even entire teams of astronomers, are encouraged to collaborate with each other, thus to share observing time and the resulting data. Astronomy is by essence an international science and it is quite common to find scientists from all over the world joining efforts in the same team (hence a need for yellow-page services, see below).

Incidentally, service observing (i.e. observations carried out, not by the proposers themselves, but by a team of trained resident astronomers and technicians) is becoming increasingly common in order to maximize return of heavy investments, to optimize instrumental pointing by combining lists of objects (‘targets’) of several programs and to minimize effects of some adverse factors such as seeing, weather, avoidance zones, solar activity, eclipses, radiation belts and so on.

Service observing also makes it easier to deal with contingency time and with targets of opportunity (i.e. unscheduled events such as discoveries of novae, supernovae, comets and so on). In such cases, quick and efficient interactions with teams of specialists are critical for making the best of such opportunities.

Nomenclature of celestial objects

Cataloging objects properly is a fundamental issue of all astronomical observations and studies. However, when dealing with a study field as populated as the universe, to unequivocally identify the objects at hand is something that is made complicated by several factors.

First of all, the relative positions of objects in the universe, and thus in the sky, are not fixed, even if those relative movements are often imperceptible to the unassisted eye on a human time scale (except again for the solar-system objects).

Second, as all amateur photographers know, the sky looks different if observed in different wavelength ranges: the infrared sky is not quite the same as the visible one which in turn differs from the radio one, and so on. This results from the fact that celestial bodies radiate the peak of their energy in different wavelength ranges (some stars are red, others are blue; some bodies are strong emitters in the infrared, others in the X range; and so on).

Finally, and maybe most importantly, it all depends on the sensitivity and resolution at which the sky is observed in a specific wavelength range. The more sensitive the tool, the more crowded the sky ‘seen’ in that range, and the more precise the corresponding identification of a given object must be.

It is not rare either that an object (star, nebula, galaxy, …) becomes resolved into several, sometimes many, components or elements when observed at a higher resolution. Hence the need for establishing some hierarchical relationships between the designations of those objects as seen in different conditions.

In an excellent environment (i.e. away from the light pollution of densely populated areas), the unassisted human eye can see about 9000 stars brighter than magnitude 6.5 for the whole sky. This corresponds roughly to the Bright Star Catalog (BSC) that provides basic astronomical information. The Henry Draper (HD) Catalog, that goes down to about magnitude 9 (remember that the larger the magnitude, the fainter the object), lists already more than 272 000 stars.

One of the first astronomical space experiments, the Belgian–UK Ultraviolet Sky Survey Telescope (S2/68) on board the TD1 satellite, led to a catalog of ultraviolet fluxes for 31 215 stars collected at the beginning of the 1970s. A decade later, the IRAS Catalog of Point Sources (alone) gathered together some 250 000 well-confirmed infrared point sources.

As to the HST Guide Star Catalog (GSC), it lists objects brighter than magnitude 16, of which more than 15 million are classified as stars. A Hubble Deep Field (HDF) typically reveals about 1500 galaxies down to nearly magnitude 30 (i.e. nearly four billion times fainter than the human eye can see) in an area of a few arcminutes (equivalent to a dime seen at a distance of 75 feet). One thus realizes the challenge of unequivocally identifying objects in the sky as the instrumental sensitivity and resolution increase so dramatically.
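The two comparisons quoted above can be checked with a few lines of arithmetic. The sketch below uses the standard magnitude relation (five magnitudes correspond to a factor of 100 in brightness) and simple geometry. The naked-eye limit of magnitude 6 and the dime diameter of 17.91 mm are assumptions made for this illustration; they are not values given in the article.

```python
import math


def flux_ratio(m_bright: float, m_faint: float) -> float:
    """How many times fainter an object of magnitude m_faint is than one of m_bright."""
    # Magnitude scale: a difference of 5 magnitudes is a factor of exactly 100 in flux.
    return 10 ** (0.4 * (m_faint - m_bright))


def angular_size_arcmin(diameter_m: float, distance_m: float) -> float:
    """Apparent angular diameter, in arcminutes, of a disc seen face-on."""
    angle_rad = 2 * math.atan(diameter_m / (2 * distance_m))
    return math.degrees(angle_rad) * 60


# Assumed naked-eye limit of magnitude 6 against the HDF limit near magnitude 30.
print(f"{flux_ratio(6.0, 30.0):.2e} times fainter")  # roughly four billion

# Assumed dime diameter of 17.91 mm, seen from 75 feet (22.86 m).
print(f"{angular_size_arcmin(0.01791, 22.86):.1f} arcmin")  # a few arcminutes
```

With a naked-eye limit of magnitude 6.5 instead of 6, the same relation gives a ratio of about 2.5 billion, so the quoted figure is best read as an order of magnitude.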

The nomenclature of astronomical objects follows a number of rules that differ with the objects at hand.

Solar-system objects, including natural satellites, minor planets and comets, as well as planetary features, are named following recommendations of ad hoc professional committees confirmed by the International Astronomical Union (IAU).

Discoveries of new bodies are handled by the Central Bureau of Astronomical Telegrams (CBAT) and the Minor Planet Center (MPC) that assign provisional designations until ratification by the IAU (for example, lunar craters named after outstanding scientists, asteroids bearing names selected by their discoverers, and so on). Comets are nowadays the only bodies still receiving the names (up to three) of their discoverers, together with an alphanumerical identification. The CBAT also informs on discoveries of supernovae as well as other remarkable discoveries, events and celestial objects.

A whole book would be necessary to describe in detail all the systems used to designate stars and non-stellar objects. If a few bright ones have ancient historical names (Sirius), other designations include constellation memberships (alpha CMa), memberships of declination zones (-18°1345), sequential discovery numberings (V341 Sco), references to wavelength ranges (Sco X-1), to object natures (PSR 0031-007), to observatories, instruments or spacecraft (Lick Halpha 52), to larger objects (NGC 125-6), to astronomer names (Barnard’s star), and so on, not to forget the most common identifications: sequence numberings in hundreds of astronomical catalogs and observing logs available. In other words, most celestial objects have just numbers and positions on the sky.

A few words are in order here on a fashion that has been developed by non-official organizations on naming celestial objects and selling corresponding certificates. The fact is that such names have no formal or official validity whatever. Astronomers and their representative organizations dissociate themselves entirely from such commercial practices of selling fictitious names of surface features, of stars and of celestial bodies.

The IAU is the sole internationally recognized authority for naming them. Those names are not sold, but assigned according to internationally accepted rules, recognized and used by scientists, space agencies and authorities worldwide.

Catalogs, surveys, observing logs and archives

As introduced above, catalogs can be organized according to object types (e.g. planetary nebulae, quasars, pulsars, …), according to the type of data they offer (e.g. proper motions, photometric indices, spectral types, …), or both. So many catalogs are available nowadays that it is out of the scope of this article to detail them. Refer, however, to the services maintained by the data centers (see below and the list of URLs at the end of the article).

Surveys refer more to systematic coverages of the sky with specific instruments, such as the famous Palomar Observatory Sky Survey (POSS), which was in fact a professional photographic atlas of the sky as seen in two colors by the Palomar Schmidt telescope.

Also important nowadays are the so-called observing logs, which gather together the details of the observations carried out with a specific instrument (objects observed; date, time, duration of observations; instrumental specifications and/or configurations; and so on). They are in general linked to instrumental archives.
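As an illustration of what an observing log records and how a time allocation committee might consult it to avoid duplicated observations, here is a minimal sketch. The field names, the instrument name and the log entries are all hypothetical; real observing logs carry many more details and follow each observatory's own conventions.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class LogEntry:
    """One line of a (hypothetical) observing log."""
    target: str            # designation of the object observed
    start: datetime        # date and time of the observation
    duration: timedelta    # exposure or integration time
    instrument: str        # instrument used
    configuration: str     # observing mode or set-up


def previous_observations(log, target, instrument):
    """Return earlier entries for the same target and instrument, oldest first."""
    hits = [e for e in log if e.target == target and e.instrument == instrument]
    return sorted(hits, key=lambda e: e.start)


# Invented entries, for illustration only.
log = [
    LogEntry("Sco X-1", datetime(1998, 6, 3, 22, 10), timedelta(minutes=30),
             "SPECTROGRAPH-A", "low dispersion"),
    LogEntry("V341 Sco", datetime(1998, 6, 3, 23, 5), timedelta(minutes=15),
             "SPECTROGRAPH-A", "high dispersion"),
    LogEntry("Sco X-1", datetime(1997, 5, 20, 21, 40), timedelta(minutes=45),
             "SPECTROGRAPH-A", "high dispersion"),
]

for entry in previous_observations(log, "Sco X-1", "SPECTROGRAPH-A"):
    print(entry.start.date(), entry.configuration, entry.duration)
```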

Archiving is also a critical issue in astronomy, basically because practically all celestial objects are variable with time in a way or another and also, as seen earlier, because we are essentially passive observers of the universe and not experimenters who would be able to recreate at any time situations for further investigations.

The memory of past observations is thus fundamental for future studies since furthermore it is not known for what purposes the next generations of astronomers would use the data. For a couple of decades now, the funding of all big new projects includes provisions for comprehensive image processing, adequate archiving and further exploitation of the archive after the possible termination of the mission. This might sound an obvious policy, but it is in fact a rather recent approach.

The challenge here is of a technical nature as the average lifetime of storage media is about 4 yr and about 3 yr for the user interfaces. Thus any long-term archiving policy has to make provisions for regularly transferring data to new media.

Databases and information hubs

Beyond catalogs and collections of catalogs, extremely powerful databases are now available. The data centers themselves have become information hubs providing a whole spectrum of world-wide services.

The Strasbourg astronomical Data Center (CDS) (figure 2) has been a long-time pioneer and is nowadays recognized as the world leader in these matters.



Figure 2
A schematic view of the CDS information hub (courtesy Centre de Données de Strasbourg, CDS).


In the early 1970s, a number of European institutions decided indeed to collaborate and to create a data center at Strasbourg Observatory with the daunting task of establishing an enormous table of synonyms between all catalog identifications. This was definitely aimed at avoiding the repetition of a couple of situations where two qualified astronomers studied and published papers on the same object, but under different identifications and without realizing it.

However, more interestingly, this allowed access, with a single object name, to all other identifiers, as well as to the data listed in the various catalogs. As CDS also set up a comprehensive and very successful bibliographical object-oriented database, the references of the papers dealing with that object were immediately available by the same token. The popular Simbad database was born.

Simbad holds today more than 2 200 000 objects under about 5 500 000 identifiers, together with more than 105 000 bibliographical references including about 3 000 000 object citations. Anyone starting a serious study of an astronomical object must pay a visit to Simbad first.
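The idea of a table of synonyms tied to an object-oriented bibliography can be sketched as a small data structure in which every identifier points to the same object record. This is only an illustration of the principle, not a description of how Simbad is implemented; the class, its methods and the two paper titles are hypothetical, and the identifiers used for Sirius are given as examples.

```python
from collections import defaultdict


class SynonymTable:
    """Maps any known identifier to one object, its aliases and its references."""

    def __init__(self):
        self._object_of = {}                     # normalized identifier -> object key
        self._identifiers = defaultdict(list)    # object key -> identifiers as entered
        self._references = defaultdict(list)     # object key -> bibliographical references

    @staticmethod
    def _normalize(identifier):
        # Ignore case and extra blanks so that 'HD  48915' and 'hd 48915' match.
        return " ".join(identifier.split()).lower()

    def add_object(self, identifiers):
        key = self._normalize(identifiers[0])
        for identifier in identifiers:
            self._object_of[self._normalize(identifier)] = key
            self._identifiers[key].append(identifier)
        return key

    def add_reference(self, identifier, reference):
        key = self._object_of[self._normalize(identifier)]
        self._references[key].append(reference)

    def lookup(self, identifier):
        key = self._object_of[self._normalize(identifier)]
        return self._identifiers[key], self._references[key]


table = SynonymTable()
table.add_object(["Sirius", "alpha CMa", "HD 48915", "HR 2491"])

# Two papers published under different identifications of the same star.
table.add_reference("HD 48915", "hypothetical paper A")
table.add_reference("alpha CMa", "hypothetical paper B")

names, references = table.lookup("sirius")
print(names)
print(references)
```

Because both papers are attached to the same object record, a query under any one name returns them together, which is precisely what prevents the duplicated studies mentioned above.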

Figure 2 shows other components of CDS’s information hub. VizieR is a search-and-shop individual-catalog service, with also access to large tables published in professional journals. Aladin is an interactive digitized sky atlas. Besides the references of Simbad, the CDS bibliographical service provides also access to abstracts of several major journals (see also ADS hereafter). The dictionary of nomenclature gives references and details on usage for more than 4000 different catalog acronyms. AstroGlu is a discovery tool helping to locate database servers providing relevant information.

Last, but not least, the StarPages are a set of permanently updated yellow-page resources providing detailed information on astronomy-related organizations (StarWorlds: about 6300 entries, more than 5500 WWW links), as well as on individual astronomers and related scientists (StarHeads: more than 5200 WWW home pages). They offer also access to an enormous dictionary (StarBits: about 140 000 entries) of abbreviations and acronyms related to astronomy and associated space sciences. The master files for StarWorlds have been used to produce figure 4 (see below).

Figure 4
Geographical distributions (North America, World, Western Europe) of astronomy-related organizations (all categories) with electronic facilities (e-mail and/or web pages) in April 1999.


A few other multipurpose resources deserve also a mention in this section.

The NASA/IPAC Extragalactic Database (NED) has been built around a master list of extragalactic objects for which cross-identifications of names have been established, accurate positions and redshifts entered to the extent possible, and some basic data collected. Bibliographic references relevant to individual objects have also been compiled.

The Astrophysics Data System (ADS) is a NASA-funded project whose main resource is an abstract service (about half a million abstracts for astronomy and astrophysics only), together with links to scanned images of over 40 000 journal articles. ADS provides also access to astronomical data catalogs and data archives, thereby making data collected by NASA space missions available to astronomers. It offers also access to the StarPages.

The National Space Science Data Center (NSSDC) provides access to a wide variety of astrophysics, space physics, solar physics, lunar and planetary data from NASA missions, together with some complementary data.

The Canadian Astronomy Data Centre (CADC) and the Astronomical Data Analysis Center (ADAC) at the National Astronomical Observatory of Japan (NAOJ) are national resources offering access to a number of catalogs, databases and archives.

Data processing

Until not so long ago, data processing consisted essentially in photographic-plate scanning and in reduction of raw photometric data from punched paper tapes.

Nowadays, with the omnipresent digitization and multidimensionality of collected data, one rather speaks of image processing and a whole treatise could be devoted to it.

Indeed the time seems definitely gone, at least with the big instruments, when astronomers were coming themselves to the telescope (or ground observatory for a spacecraft) and later returning home with unprocessed data. The advanced experiments have developed their own specific image processing which is now part of the projects themselves since the development phase.

Several comprehensive image handling systems have however been developed such as ESO’s Munich Image Data Analysis System (MIDAS) and NOAO’s Image Reduction and Analysis Facility (IRAF). They can be considered as general-purpose software systems for the reduction and analysis of astronomical data and are regularly upgraded.

Complementary resources would include specific software packages and libraries (such as statistical ones), spectral line compilations, abundance libraries, opacity tables, and so on. A network such as Starlink helps UK-based astronomers to reduce and analyze their observations.

It is certainly worthwhile to mention here that astronomers have introduced the Flexible Image Transport System (FITS) which is a general way to encode both definitions of data and the data themselves and which is machine independent. The FITS format has been quickly recommended for interchange of image data between all observatories—and is now in use even outside astronomy.
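To make concrete what encoding ‘both definitions of data and the data themselves’ means, the following sketch writes a tiny image in the basic FITS layout: a text header made of 80-character keyword records, followed by the pixel values in big-endian byte order, each part padded to a multiple of 2880 bytes. It covers only the mandatory keywords of a primary header and is meant as an illustration of the layout; the helper names are hypothetical, and real work should rely on an established FITS library.

```python
import struct

BLOCK = 2880   # FITS files are organized in 2880-byte blocks
CARD = 80      # each header record ('card') is 80 ASCII characters


def card(keyword, value, comment=""):
    """Format one fixed-format header card: KEYWORD = value / comment."""
    text = ("T" if value else "F") if isinstance(value, bool) else str(value)
    line = f"{keyword:<8}= {text:>20}"       # value right-justified up to column 30
    if comment:
        line += f" / {comment}"
    return line[:CARD].ljust(CARD)


def pad(data, fill):
    """Pad a byte string with the fill byte up to a whole number of blocks."""
    return data + fill * (-len(data) % BLOCK)


def encode_image(rows):
    """Encode a list of equal-length rows of 16-bit integers as FITS bytes."""
    height, width = len(rows), len(rows[0])
    header = "".join([
        card("SIMPLE", True, "conforms to the FITS standard"),
        card("BITPIX", 16, "16-bit signed integers"),
        card("NAXIS", 2, "number of axes"),
        card("NAXIS1", width, "fastest-varying axis"),
        card("NAXIS2", height),
        "END".ljust(CARD),
    ]).encode("ascii")
    # Big-endian encoding makes the file independent of the writing machine.
    pixels = b"".join(struct.pack(f">{width}h", *row) for row in rows)
    return pad(header, b" ") + pad(pixels, b"\0")


blob = encode_image([[1, 2, 3], [4, 5, 6]])
print(len(blob))                       # one header block plus one data block
print(blob[:CARD].decode("ascii"))     # the first header card
```

The header describes the data (type and dimensions) in plain text that any machine can read, which is the property that made the format suitable for interchange between observatories.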

Publishing and information sharing

Publishing is not only motivated by the noble aims of educating and information sharing but also strongly conditioned by career constraints involving recognition, a necessity that should not be underestimated. Recognition is sought for getting positions (i.e. grants and salaries), for obtaining acceptance of proposals (e.g. leading to data collection) and for achieving funding of projects (allowing materialization of ideas).

The pressure for recognition has contributed to the strong increase of professional publications (see figure 3), together with other factors such as the expansion of the astronomy community itself (especially after the beginning of the space age), the multiplication of large instruments and spacecraft equipped with always faster, more diversified and more efficient detectors, and so on. Commercial publishers have also put on the market more journals which were as many additional communication outlets.

Figure 3
An illustration of the dramatic increase of astronomical literature over the past decades. Helmut A Abt, Editor-in-Chief of the Astrophysical Journal, is standing next to stacks of that leading professional publication (courtesy the National Optical Astronomy Observatories, NOAO).


The major professional journals use the peer-review procedure (‘refereeing’) for accepting, amending or rejecting submitted contributions. Albeit a matter of regular debates (on the principle itself as well as on the way it is conducted), the refereeing process has been so far the best one (or the least questionable one) to publish contributions with validated content, i.e. an assurance of good-quality, novel results obtained by reproducible experiments, calculations or analyses on which enough details are provided.

The most important general professional journals include the Astrophysical Journal and the Astronomical Journal published by the American Astronomical Society, the Monthly Notices of the Royal Astronomical Society, and Astronomy and Astrophysics resulting from the merging in 1969 of a number of European professional journals.

Astronomers communicate also via a whole spectrum of publications ranging from informal newsletters to books gathering together review papers by the best specialists on specific topics. Conferences, colloquiums, workshops and meetings of all kinds provide also efficient ways of exposing oneself to both excellent review talks and presentations of works in progress. The corresponding proceedings are published by commercial publishers, by learned societies, by research institutions or even by individuals, reasonably soon after the events.

As described in the following section, publishing is nowadays increasingly done electronically, or, better said, there is more and more of diversified publishing, i.e. of information available on different media (paper, CD-ROM, web sites, and so on). These media are not excluding, but complementing, each other. Several journals have an electronic counterpart.

As mentioned earlier, professional astronomers are also contributing substantially to less specialized publications, mainly directed towards amateur astronomers and the public at large. Many countries have their own such national journal, but Sky & Telescope is probably the magazine with the largest audience world-wide.

Electronic astronomy

As mentioned already, more and more of the information exchanges in astronomy are done electronically, both dynamically (e-mail) and passively (web sites). Electronic handling is particularly well adapted to the fluid and living nature of today’s information material, to the digitized nature of most data and to the efficiency of contacts between collaborators spread over the world, as well as to the retrieval of information from the information hubs. It also makes easy the various interactions upstream and downstream, and allows diversified publishing.

Figure 4 gives an idea of the distribution, at the time of writing this article, of the astronomy-related organizations using e-mail and/or having a web site. Together with the general world distribution, more detailed views are displayed for North America and Western Europe.

The largest concentrations are located in Europe and in the USA (Northeast and California), with a few nuclei in Australia, India, Japan, New Zealand, as well as a few spots in South America. A striking feature of the central map is the desperate emptiness of the African continent. A similar comment is also applicable to quite a number of the so-called third-world countries.

In Western Europe, it is striking how France, Portugal and Spain have significantly much lower densities than their European neighbors, obviously lagging behind as to the penetration of the internet and the WWW, a few years after the electronic medium started spreading quickly over the world.

The ‘electronization’ of astronomical information handling is, however, not a bed of roses. New facilities and new possibilities bring in naturally new questions, new challenges and new problems that have to be faced, especially at the ethical level (proper credits of downloaded material, for instance), as well as at the legal (financial and copyright policies, electronic signatures, for instance) and educational (training of young and not-so-young people) ones, without forgetting the security and the fragility of the material delivered via the electronic medium—a very worrying factor for many people.

The most demanding challenge, however, arises probably from the enormous quantity of information easily reachable (and providable) by everybody nowadays, as discussed in the following section.

The information retrieval challenge

At ACM97, the conference celebrating the 50th anniversary of the Association for Computing Machinery, the Nobel Prize laureate Murray Gell-Mann called attention to the fact that, with the digital age producing an ‘immense sea of data that threatens to drown humanity’, people needed to adapt how they think so that true knowledge can be distilled from the deluge.

‘We hear, in this dawn of the so-called information age, a great deal of talk about the explosion of information and new methods for its dissemination. It is important to realize, however, that most of what is disseminated is misinformation, badly organized information or irrelevant information. How can we establish a reward system such that many competing but skillful processors of information, acting as intermediaries, will arise to interpret for us this mass of unorganized, partially false material?’

That challenge is probably the most critical one currently in the field of information handling—and astronomy is facing it too.

We shall have to rely, indeed, on the wisdom of providers and users of electronic information, as well as on the various intermediaries (compilers, information hubs, …) and on the learned societies, committee experts and so on to take this into account, to put, more than ever, the emphasis on validated and authenticated electronic information, and to reward appropriately the scientists who are dedicating or will dedicate a full-time job to such activities.

The quality of resources as well as their maintenance must be continuously improved from lessons learned with time and using the most adequate tools. Generally speaking, information has to be collected, verified, de-biased, homogenized and made available not only in an efficient way but also through reliable channels. Sophisticated techniques cannot replace the extensive, unrewarding and very careful background work which is indispensable for the compilation of a valuable resource. One could never stress enough the importance of this obscure daily work consisting of patiently collecting data, checking information and updating it. This has also to be carried out by knowledgeable scientists or documentalists and cannot be delegated to inexperienced clerks.

Efficient search engines working on validated and authenticated material must enable finding information looked for, if available at all, and whether it relates to celestial objects, to data, to bibliography or to the vast coverage by yellow-page services. That information must be of good quality, relevant and on target. The best search engines available today and the corresponding organizations have been mentioned in this article.

As technology leaders agree that changes are the only sure thing about computing and communications in the next decades, we can certainly expect further modifications to information handling in astronomy and consequently to the sociodynamics of the astronomy community itself.

Useful URLs

For WWW access, here are—in alphabetical order—the URLs of the most relevant organizations and resources mentioned in the article. Most of them are collaborating with each other and are hosting mirror pages of their partners. Complementary resources are also generally available from the sites.

American Astronomical Society (http://www.aas.org/)

Association for Computing Machinery (http://www.acm.org/)

Astrophysics Data System (http://adsabs.harvard.edu/)

Canadian Astronomy Data Centre (http://cadcwww.dao.nrc.ca/CADC-homepage.html)

Central Bureau of Astronomical Telegrams (http://cfa-www.harvard.edu/cfa/ps/cbat.html)

International Astronomical Union (http://www.iau.org/)

Minor Planet Center (http://cfa-www.harvard.edu/iau/mpc.html)

NAOJ Astronomical Data Analysis Center (http://adac.mtk.nao.ac.jp/)

NASA/IPAC Extragalactic Database (http://nedwww.ipac.caltech.edu/)

National Space Science Data Center (http://nssdc.gsfc.nasa.gov/)

Royal Astronomical Society (http://www.ras.org.uk/ras/)

Simbad (http://simbad.u-strasbg.fr/Simbad)

Sky & Telescope (http://www.skypub.com/)

Starlink (http://star-www.rl.ac.uk/)

Strasbourg Data Center (http://cdsweb.u-strasbg.fr/CDS.html)

Bibliography

Because of the quick evolution of information technologies nowadays, and because of the impact they have on the dynamics of the astronomy community, some aspects of the most recent compilations and reviews could become rather quickly outdated. The basic principles will largely remain unchanged, however, and the following books could be recommended as possible further, more technical, reading:

Series of specialized conferences provide also sources for advanced material. See for instance the following as well as the numerous references quoted therein:


Paper published in the Encyclopedia of Astronomy and Astrophysics, Ed. Paul Murdin, IoP Publishing/Nature Publishing Group (Jan. 2001), pp. 1200-1207
© Copyright André HECK, current year.