Electronic publishing in its context

André HECK

Observatoire Astronomique
11, rue de l'Université
F-67000 Strasbourg
France

Abstract

Contextual aspects of electronic publishing (and, more generally, of diversified publishing) are discussed. Definitions and concepts are introduced. Pending issues and challenges are identified. Emphasis is put on the need to provide authenticated and validated information. The electronic medium is a new medium per se that will coexist with others, such as paper, but it will call for specific procedures, strategies and policies. Much remains to be done at the human level.

Introduction

Electronic publishing (EP) is slowly taking shape. The technology is there, certainly evolving rapidly, and it will progress still further in the future, but a number of factors have dilatory effects and are sometimes underestimated. While it is obvious that advantage must be taken of the technological advances, it is not yet quite clear to many of us what their exact impact will be, nor indeed whether all the potentialities at hand are well understood. Human inertia is also a severely limiting factor and, speaking of scientific communities, the traditional procedures and the habits progressively adopted over the decades (if not centuries) are difficult to alter. Human nature is simply and basically reluctant to change, especially when the final outcome is not well or not fully perceived.

The physics and astronomy communities have been among the first involved in EP, even before the concept itself existed per se. Astronomers, space physicists, high-energy physicists and their colleagues around the world have done more than just help in setting up the Internet and the associated networks. They jumped onto the World-Wide Web (WWW) and quickly became prolific producers and eager consumers of its resources (Hardin 1993).

As scientists, our ultimate aim is to contribute to a better understanding of the universe stricto sensu (as well as of its past and future) and consequently to a better comprehension of the place and rôle of man in it. To this end, together with theoretical studies, we carry out observations to obtain data that will undergo treatments and studies leading to the publication of results. The whole procedure can include several iterations or interactions between the various steps as well as with external fields, non-scientific disciplines, instrumental technologies, and information-handling methodologies.

Electronic information handling is a broad and flexible concept with plenty of degrees of freedom, adapted to the fluid and living nature of today's information material. It encompasses data collection, analysis, dissemination, and so on, as well as publishing (classical or otherwise). None of these elements can be dissociated from the others, as electronization has facilitated the various interactions upstream and downstream, the ultimate step being a diversified publication of the final results. The classical scheme involving authors, editors, referees, publishers and readers - or, more generally speaking, information providers and users plus intermediaries - is also changing and can become very complex if it includes authentication and validation loops. It should be kept in mind that publishing is not only motivated by information sharing, but also strongly conditioned by career constraints.

There have not been many dedicated conferences on electronic publishing. In the general field of science, ICSU Press and UNESCO have to be credited with an impressive meeting (Shaw and Moore 1996; Shaw 1997). There have also been a number of events in the rather compact and well-structured astronomical community (see e.g. Heck 1992; Heck & Murtagh 1996) [1].

Definitions and concepts

In the following, information will be considered as what is communicated by others or obtained from investigation, study, or instruction. It covers the observational material, the more or less reduced data extracted from it, the scientific results, as well as the accessory material increasingly used by scientists in their work (bibliographical resources, yellow-page services, software libraries, and so on).

While publication will be considered as a public announcement (no implicit assumption being made as to the medium used), communication will be taken as the act or action of imparting or transmitting (while respecting constraints such as proprietary rights and so on). These definitions correspond to widely accepted concepts in information sciences, but the meaning of the terms could be different in fields such as marketing or advertising. Of course, it should also be kept in mind that data analysis, information sharing and related activities should never be an end per se as science must remain the main objective.

The very structure of information has become different: beyond the classical quasi-linear layout of publications on paper, electronic documents include hypertextual links, whose structure is more closely adjusted to the mental structure of many people.

The information material as a whole now exists in an increasingly distributed way. Data centres have seen their rôle evolve and now tend to act more as hubs towards distributed specialized repositories of different types of data and material (rather than, as in the past, holding as much as possible themselves and carrying out the integration work at their own location). This is largely because the evolution of information technology has brought major modifications in hardware and connectivity, as well as new tools (client/server facilities, WWW browsers, resource discovery packages, ...) and concepts (hypertext/hypermedia, virtual libraries, ...).

Finally, we have now entered for good the era of fluid information, i.e. material that can be continuously updated, upgraded, enlarged, improved, modified, and so on. This new concept implies those of document (in)stability and of document genetics: beyond its own permanent possible evolution, a document can give birth to subsidiary ones, first linked to itself; with time, the relevance of some of these can supplant that of the original document, which would then virtually `die'. Forgetting this fluidity would be equivalent to staying with CD-ROMs, which are frozen repositories of fixed information and sometimes fall short of adequately answering some of our current needs.

The new medium

The emergence of the electronic medium is currently best represented by the WWW (but what will it be tomorrow? microelectrodes linked to a bio-cyberspace? - see e.g. Gibson 1986 & 1993). The WWW is based on hypertext and hypermedia. It has become, with unprecedented speed, a magnificent communication tool that has been called the `fourth medium' and which is de facto a fantastic cross-disciplinary, cross-educational and cross-social meeting ground allowing exchanges in a new dimension. It is a highly dynamic domain, evolving rapidly.

Each of us has become an actual or potential author-creator of electronic documents, thereby acquiring very rapidly, ipso facto, an extremely high visibility, well beyond the horizon traditionally reached in specific circles. This is especially due to the tremendous efficiency of the formidable search tools available on the web. One must be fully conscious of this and prepare any document to be published accordingly, i.e. with ad hoc caution and ethics.

The best search engines include Yahoo (URL: http://www.yahoo.com), Lycos (URL: http://www.lycos.com), and Digital's Alta Vista (URL: http://www.altavista.digital.com) which has our preference. An example of resources with search engines more specific to astronomy and related fields is described elsewhere in this volume (Heck 1997).

The explosion of electronic documents does, however, bring new questions, new challenges and new problems that will have to be faced, especially on the ethical, legal and educational levels, not to mention the security and fragility of the material delivered on the electronic medium. We shall mention only a few specific points here. The interested reader will find more details in Heck (1995 & 1996) and in the references quoted therein.

Diversified publishing

Flexible publishing or, as we prefer to call it, diversified publishing implies that we shall be able to go to any medium we like (WWW, CD-ROM, paper, and so on) in a hopefully automated fashion, as sketched below. The road to achieving this fully requires, inter alia, making multimedia production faster and more efficient.
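As an illustration only, here is a minimal sketch of such single-source publishing in Python: one structured document rendered automatically to two media. The document structure and function names are assumptions made for this sketch, not an existing system.

    # Minimal single-source publishing sketch: one structured document,
    # several renderings. All names here are illustrative assumptions.

    article = {
        "title": "Electronic publishing in its context",
        "sections": [
            ("Introduction", "Electronic publishing is slowly taking shape."),
        ],
    }

    def to_html(doc):
        """Render the source for the WWW."""
        parts = ["<h1>%s</h1>" % doc["title"]]
        for heading, body in doc["sections"]:
            parts.append("<h2>%s</h2>\n<p>%s</p>" % (heading, body))
        return "\n".join(parts)

    def to_text(doc):
        """Render the same source for paper or plain-text distribution."""
        parts = [doc["title"], "=" * len(doc["title"])]
        for heading, body in doc["sections"]:
            parts.extend(["", heading, "-" * len(heading), body])
        return "\n".join(parts)

    print(to_html(article))
    print(to_text(article))

The point is that the medium-specific form is produced mechanically at the end of the chain; only the single validated source is maintained.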

But attitudes are still too often timorous and/or conservative in view of what is possible with the current development of technologies and methodologies. Too many people fall short of the potentialities of the new medium and still see electronic publishing as little more than putting on line an electronic version of something that also exists on paper.

Do not misunderstand the above, though. Putting a printed document on line is not wrong, but it is by far insufficient. Why? Simply because the electronic medium is exactly what the name says - a new medium per se, complementary to the existing ones - and because its usage should imply, and even require, dedicated techniques, policies, and strategies.

This is such an obvious statement that it probably does not need lengthy illustrations. Comparisons are often drawn with the advent of radio or, better, television. The introduction of a new medium does not lead to the disappearance of former ones (in this case, newspapers and magazines). It calls, however, for a specific approach tailored to it, in the same way that, on TV, they do not zoom in on newspapers or broadcast people reading magazines.

It is obvious, however, that electronic-publishing policies have not yet reached a final degree of maturity, even though some of them are already quite elaborate (see various chapters of this volume and the references therein). Few of these policies go beyond an `electronization' of a paper document and of the previous procedures used to deal with it. These are certainly made faster and more flexible, but they still fall short of satisfactorily providing a solution for the fluid nature of today's information and the living character of information retrieval on the WWW.

It must be clear that maintaining at all costs compatibility between PostScript, PDF and HTML (something repeatedly announced in publishing ventures) would prevent taking advantage of the hypertextual structure, sound, motion, applets and whatever may come next on the electronic medium.

Thus the often-heard opposition between classical publishing on paper and electronic publishing is unfounded since, as explained above, the new medium is complementary to existing ones. The latter will have to adjust to the arrival of the newcomer (as newspapers and magazines had to do for television), but there is no reason for publications on paper to disappear.

Validation and authentication

It is also obvious that learned societies, funding organizations, expert committees and other bodies will have to integrate diversified-publishing productions into their evaluation procedures, that is, into the assessment of the activities, plans and projects of individuals and organizations; this implies a revision of the corresponding practicalities for recognition.

Indeed the phenomenology of publishing is not only motivated by the need to share information, but also strongly conditioned by recognition, a necessity that should not be underestimated and which is largely based on publications in refereed journals. Such recognition is sought for getting positions (grants and salaries), for obtaining acceptance of proposals (leading to data collection), and for achieving funding of projects (allowing the materialization of ideas).

This of course implies another step: the adaptation of validation procedures to electronic material (`refereeing' it) and of measures for subsequently guaranteeing its integrity (see below). Reliable validation procedures are more than ever necessary because it has become increasingly difficult to distinguish between the so-called grey literature and the formal one.

Additionally, the authentication of originators (authors, institutions or organizations) might become an increasingly critical issue with electronic material, and its importance is already increasingly acknowledged.

A phenomenon that has to be appreciated is that, independently of any validation procedure, the servers and web documents of persons and organizations with profile and reputation will be visited regularly, with preference and confidence, thus disrupting the current preprint-submission-publication chronology.

One could even wonder whether servers of preprints and conference proceedings, not to forget those of personal documents and productions, will not take over if the procedures of the learned societies, the commercial publishers and other traditional channels remain slow and heavy, failing to respond to the dynamism, the fluidity and the visibility available via the electronic vector.

Mentalities, habits and policies will have to adjust progressively, with the usual delayed reaction time resulting from natural human and social inertia, coupled with a certain reluctance towards the new medium, if not sometimes a definite distrust, linked to the fragility of the electronic material and to the alterations it could easily undergo.

Maintenance for quality

The maintenance process of information resources must be continuously improved from lessons learned over time and by using the most appropriate tools. Generally speaking, information has to be collected, verified, de-biased, homogenized, and made available not only in an efficient way, but also through operationally reliable means (it becomes useless if plugged into a confidential network or reachable only through deficient routers). Redundancies have to be avoided; precision is, and details can be, extremely important.

While scientists have a natural tendency to design projects and software packages involving the most advanced techniques and tools, there is in general less enthusiasm for the painstaking and meticulous long-term maintenance that builds up the real substance of quality resources. This maintenance also has to be carried out by knowledgeable scientists or documentalists and cannot be delegated to inexperienced clerks or temporary employees, since the necessary experience is long and slow to acquire.

Information retrieval per se raises a number of evaluation issues (see e.g. Harman 1992 and the subsequent papers of the corresponding special issue). The fashion is now shifting towards designing and experimenting with quality-control processes. This might be a very serious matter or a big joke: none of the algorithms currently available has really convinced us of its absolute necessity and satisfactory efficiency.

Such a short time after its birth, the web already needs a good cleaning, so numerous are the anchors pointing towards non-existing documents and so many are the useless or obsolete documents on the various servers worldwide. It is certainly up to the `webmasters' to fix this and to prune the dead wood from their respective sites; a simple tool along the lines sketched below can help.
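As an illustration only, here is a minimal dead-link detector in Python, using the standard library: it collects the anchors of a page and reports those that no longer resolve. The names and the crude error handling are assumptions of this sketch, not a recommendation of a particular tool.

    # A minimal dead-link detector (illustrative sketch, standard library only).
    import urllib.request
    from html.parser import HTMLParser
    from urllib.parse import urljoin

    class AnchorCollector(HTMLParser):
        """Collect the href targets of all <a> anchors in a page."""
        def __init__(self):
            super().__init__()
            self.links = []
        def handle_starttag(self, tag, attrs):
            if tag == "a":
                for name, value in attrs:
                    if name == "href" and value:
                        self.links.append(value)

    def dead_links(page_url):
        """Return the anchors of page_url that no longer resolve."""
        with urllib.request.urlopen(page_url) as response:
            parser = AnchorCollector()
            parser.feed(response.read().decode("utf-8", errors="replace"))
        dead = []
        for href in parser.links:
            target = urljoin(page_url, href)  # resolve relative anchors
            if not target.startswith("http"):
                continue  # skip mailto:, ftp:, etc. in this sketch
            try:
                urllib.request.urlopen(target, timeout=10)
            except Exception:
                dead.append(target)
        return dead

    # Example: print(dead_links("http://example.org/index.html"))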

The stability of sites and URLs is very important. URLs should not be modified except for good reasons, and forwarding pointers should then be put in place; failing that, it could become impossible to keep track of moving sites. A sketch of such a forwarding pointer is given below.
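By way of illustration, here is a minimal forwarding pointer at the HTTP level, again sketched in Python with the standard library: requests for a retired URL receive a `301 Moved Permanently' reply carrying the new location. The mapping and addresses are hypothetical; in practice the same effect is normally obtained through the web server's own configuration.

    # A minimal HTTP forwarding pointer (illustrative sketch).
    from http.server import BaseHTTPRequestHandler, HTTPServer

    # Hypothetical mapping of retired URLs to their new homes.
    MOVED = {
        "/old/catalogue.html": "http://example.org/new/catalogue.html",
    }

    class ForwardingHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            if self.path in MOVED:
                # 301 tells browsers and indexers that the move is permanent.
                self.send_response(301)
                self.send_header("Location", MOVED[self.path])
                self.end_headers()
            else:
                self.send_response(404)
                self.end_headers()

    if __name__ == "__main__":
        HTTPServer(("", 8000), ForwardingHandler).serve_forever()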

Ethics, security, and copyright

At a time when the authors/creators of electronic documents are increasingly worried about how easily their work can be altered, proper credit to the material used should always be clearly indicated. Since browsers make it so easy to download the original files, crediting the sources appropriately becomes critically important. It is also smarter and more elegant to insert a hyperlink to the original document, since such a link will then always point to the freshest version of the file.

This brings us to security issues, involving monitoring visits to the server, restricting access to some documents, preserving the confidentiality of others, and so on. Apart from governmental policies (such as the Clipper chip project in the US, which has raised substantial controversy, or the current ban on encryption in France), there is no golden rule on security issues.

It is up to each local `webmaster' to set up appropriate security. Some resources require ad hoc clearance (password, account number, and so on), as sketched below; others will be only partially retrievable in a specific query (such as large copyrighted databases); finally, other documents are freely accessible and usable, conditioned on a minimum of ethical behavior (see above). Electronic commercial transactions implying transfers of funds (including the use of a credit card number) have initiated a number of elaborate procedures (at the limit of paranoia) trying to prevent in advance all potential tricks from computer gangsters.
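For completeness, a minimal sketch of such password clearance at the protocol level: validating the `Authorization' header of HTTP Basic authentication. The user table is a hypothetical stand-in; a real installation would store hashed passwords and rely on the server's own access-control machinery.

    # A minimal HTTP Basic authentication check (illustrative sketch).
    import base64

    # Hypothetical user table; real systems store hashed passwords.
    USERS = {"observer": "secret"}

    def authorized(auth_header):
        """Validate an 'Authorization: Basic <base64(user:password)>' header."""
        if not auth_header or not auth_header.startswith("Basic "):
            return False
        try:
            decoded = base64.b64decode(auth_header[len("Basic "):]).decode()
        except Exception:
            return False
        user, _, password = decoded.partition(":")
        return USERS.get(user) == password

    # Example: authorized("Basic " + base64.b64encode(b"observer:secret").decode())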

It is also not excluded that an astronomical intranet will have to be set up at some stage in order to make exclusively professional exchanges more efficient.

Legal aspects (copyright, electronic signatures, ...) are also extremely important, and jurists are busy setting up references for computerized material. Particularly in this case, there might be variations from country to country where the law already exists. However, with the worldwide globalization of electronic communications, one can expect - and hope for - a quick harmonization of the various references and procedures. On such matters, refer e.g. to specific chapters in this volume, as well as to Samuelson (1994 & 1996) and, more generally, to her very interesting regular column `Legally Speaking' in the Communications of the ACM.

Some jurists are still discussing whether offering material on-line, displaying material on screen, and storing material temporarily are acts subject to the usual copyright protection.

As a rule, however, and from recent experience, authors should be very careful about what they sign in terms of copyright transfer agreements, and possibly refuse to transfer any copyright at all. If they are not careful enough, they might hand over much more than they would normally think, up to a total exclusivity in favor of the publisher that would prevent even a posting on a personal WWW site. The only place where the contribution could then be found would be the publisher's server, quite likely against payment. This would be another example of the fact that `copyright' does not mean protection for the author, but rather for the publisher, very seriously restricting the authors' rights on - and personal promotion of - their own work.

Minimum rights for the authors have to be protected. It is thus important to request that all restrictions (if any) that would apply in terms of electronic distribution and posting of papers be precisely specified in the copyright transfer agreement itself, as well as in the instructions for authors. Guest editors should also see this mentioned in their contracts with the publishers. These conditions should not change or evolve during the publishing process, except by mutual agreement.

One might have differing opinions on the rôle of publishers [2] and on the financial implications of the publishing process. From all the experience built up through our activities in the field of electronic publishing and information handling, our own stand is that, at the very least, an author should always be allowed to post his full papers on his own WWW site. Collaboration with publishers who prevent this should be seriously questioned.

With the advent of the electronic era, scientists and scientific institutions now have every possibility to run, if necessary, an efficient information server with validated (refereed) material without the help of a commercial publisher. Ginsparg (1996) actually concurs on this: ``A correctly configured fully electronic scholarly journal can be operated at a fraction of the cost of a conventional print journal, and could for example be fully supported by author subsidy (page charges or related mechanism, as already paid to some journals), ideally allowing for free network distribution and maximal benefit both to authors and readers."

The technical expertise is there, as well as a network of referees already functioning on a volunteer basis (and increasingly via electronic transfers), so that only limited funding will be necessary for the clerical work of the editors and a minimum of equipment. Of course, a number of precautions will be necessary, such as mirror sites, guarantees for the permanency of the service and the integrity of the refereed/validated material, links to possible updates and complementary information, etc. But there is certainly no major obstacle to providing an excellent electronic-publishing resource in the full sense of the expression.

And the human component?

Last, but not least, there are non-negligible educational aspects to be taken into account regarding the introduction and training of young and not-so-young people in the new technologies within the various communities. This is true not only for scientists, but also for librarians and documentalists, who will see their rôle change significantly within their institutions and who will be increasingly dealing with virtual material (see e.g. Brown 1996 and Grothkopf 1997).

We have already mentioned how necessary it will be to adapt our habits and policies to the new electronic medium, not only in everyday life, but also for the proper recognition of scientific activities, the results of which will quite naturally be expressed on the various media (and not only on paper). Educating all the human components involved in the new concepts, capabilities, methodologies and technologies is a process that will also require patience, dedication and ... time, as technology leaders agree on at least one point: change is the only sure thing in the next decade.

As Gell-Mann (1997) recently reminded us, with the digital age producing an `immense sea of data that threatens to drown humanity', people need to adapt how they think so that true knowledge can be distilled from the deluge. ``We hear, in this dawn of the so-called information age, a great deal of talk about the explosion of information and new methods for its dissemination. It is important to realize, however, that most of what is disseminated is misinformation, badly organized information or irrelevant information. How can we establish a reward system such that many competing but skillful processors of information, acting as intermediaries, will arise to interpret for us this mass of unorganized, partially false material?"

We have indeed to rely on the wisdom of the providers and users of electronic information, as well as on the various intermediaries (compilers, information hubs, and so on) and on the learned societies, expert committees, and so on, to take this into account, to quickly include validated and authenticated electronic information in the evaluation processes, and to give electronic publishing its deserved lettres de noblesse.

References

Notes

  1. Astronomy has also had regular conferences and reference publications on electronic information handling. Refer for instance to the various contributions in Egret & Albrecht (1995), Egret & Heck (1995), Heck & Murtagh (1993), Murtagh et al. (1995) and to the references quoted therein.
  2. We have had very differing experiences with the ones we dealt with.
