Proceedings of the

Second ICSU-UNESCO International Conference on

Electronic Publishing in Science

held in association with CODATA, IFLA and ICSTI
at UNESCO House, Paris 20-23 February 2001

received 4 January 2001

New practices for electronic publishing: how to maintain quality and guarantee integrity

Joost G. Kircz
KRA-Publishing Research
Prins Hendrikkade 141, 1011 AS Amsterdam
and Van der Waals-Zeeman Instituut, Universiteit van Amsterdam
kircz@kra.nl
www.science.uva.nl/projects/commphys

1. Introduction

At present, we are witnessing a change from paper to electronic media for the storage, dissemination and handling of scientific articles. In practice, this often means only a change from one carrier to another. Most electronic publications are simply paper products transposed onto electronic media. Neither the structure, nor the way language is used, is significantly different from earlier practice.

Nevertheless, we witness a sometimes heated debate on the value of such "electronic documents". In my view, we have to distinguish between documents that look, smell and sound like a paper document but are stored and transmitted by electronic means, and documents that are originally created for an electronic environment, and hence are new animals in the zoo of scientific communications.

The discussion on the value of electronic documents is often hampered by the fact that one starts from what one is accustomed to in the paper world and attempts to impose that on an electronic environment. The scientific paper as we know it is a paper-based object that obviously can be cast into various technical forms, but intrinsically remains a paper object.

In order to grasp the impact of the current electronic revolution, as well as being able to set out a policy towards the future, we have to abstract from the presentation form and start with the aims and content of scientific communication before we zoom in on a particular presentation form.

We have to step back and analyse what it means to write for an electronic medium and what it means to read material that is stored electronically. In a paper world, writing and reading are very close. Writing for an electronic medium means understanding the full capacities of that medium. Reading electronic articles, on the other hand, doesn’t mean reading from a screen. The presentation becomes flexible! In contrast to paper, the electronic media allow a distinct difference in presentation between the author’s favoured presentation and the consumer’s reading practice.

An electronic document is not the electronic version of a traditional paper document with embellishments such as hyperlinks, colour pictures and illustrative animations. An electronic document is a document comprising a variety of different types of information presentations that are brought together by an author in order to present a comprehensive scientific argument. Or to put it in other terms: in an electronic publication, images, animations and so on cease to be illuminating illustrations to the text, but are now semi-independent knowledge representations that together with the text comprise the scientific argument communicated to peer scientists.

In order to develop new insights into an editorial policy that maintains the essential virtues of the paper document as well as incorporates all the new exciting features, I will first discuss the scientific paper as we know it. Subsequently, new ways for knowledge expression are dealt with. In the concluding section, I try to set out some guidelines for the coming period.

2. What is a scientific paper?

For the purpose of this conference, it is not necessary to dwell at length on the coming into being and practice of present-day scientific publishing. The reader is referred to Garvey’s book Communication: the essence of science (Gar79) and the more recent book by Meadows, Communicating Research (Mea98), and references therein. A good starting point for our discussion is the report of an International Working Group (IWG99) based on a Workshop organized by the AAAS, ICSU Press and UNESCO and published under the title Defining and Certifying Electronic Publication in Science. A proposal to the International Association of STM Publishers.

The necessity of a clear understanding of what a scientific publication actually is, is well formulated as: Publication is the hard currency of science. It is the primary yardstick for establishing priority of discovery, making the status of a publication a critical factor in resolving priority disputes or intellectual property claims. Academic tenure and promotion decisions are based in large part on publication in peer-reviewed journals or scholarly books. To make these decisions fairly and with confidence, scientists and their institutions need assurance of what counts as a legitimate electronic publication.

Thus, the challenge is to ensure that, independent of the technology used, the use and exchange value of this type of currency can be established universally for all participants in the world of science.

The Working Group proposes a list of minimum characteristics to qualify a document as a "publication". It is worthwhile to confront this list with, on the one hand, the expansion of the concept document to all coherent knowledge presentations, be they textual, non-textual or a mixture, and on the other hand the list of communication needs presented by Kircz and Roosendaal (Kir96). This list of communication needs reads: 1) awareness of knowledge, 2) awareness of new research outcomes, 3) specific information, 4) scientific standards, 5) platform of communication, and 6) ownership protection. It is immediately clear that scientific communication needs as such encompass a much wider range of interaction between scientists than formal publications.

The Working Group makes a useful distinction between an informal notification, a first publication and a definitive publication. They recommend four main characteristics that adhere to all publications; we will discuss them now.

1) Fixation (i.e., the document must be durably recorded on some medium).

This demand is obvious as no communication or debate independent of time and location is possible without a fixedness of the object of discussion on any technical type of platform of communication. The change into an electronic environment implies that the notion of "durably recorded" is under attack. In contrast to the paper world where we can demand that the paper is printed on acid-free paper according to an official standard, in an electronic environment, we don’t have any idea yet what an equivalent form looks like. We have no idea what kind of optical, magnetic or other medium will be selected in the coming years as the accepted standard and we have no idea what the method of writing on that medium will be. The technology push is such that almost every month we are confronted with a claim for an even superior technology. On top of that, we have to expand the notion of a document to non-pure textual objects such as images, simulations or other multimedia objects that might be the final outcome of a research programme. We don’t have to think about fashionable computer-game type presentations. The ability of electronic media to treat civil engineering design drawings (not necessarily of a complicated CAD/CAM type) the same way as textual documents is a simple example and already very difficult to tackle.

This means that the demand for fixedness must be tailored towards a demand for the inalterability of the content of the said object. We therefore have to interpret the demand for fixedness as a demand for a well-defined descriptive standard about the content of the document: a standard that enables the storage and maintenance of the integrity of the information independent of the carrier of that information, be it a clay tablet or a future DNA chip. It goes without saying that the current developments in descriptive languages such as the Standard Generalized Markup Language (SGML) and its successor the eXtensible Markup Language (XML) are of the utmost importance. If, finally, all information in a document is properly coded according to such a language, we deal with simple ASCII, or better Unicode, strings that can be handled in all conceivable material memory structures. For integrity reasons, such a file can be endowed with an electronic watermark. For the future user of the once-stored document, only the capability to read it again from the then popular medium is of importance. For the immediate future, an interesting initiative is the NCSA Astronomical Digital Image Library (ADIL), a repository providing astronomers with research quality images straight from the telescope to their desk over the Web (Pla99).
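The idea that integrity belongs to the coded content and not to the carrier can be illustrated with a small Python sketch. It is only an illustration under stated assumptions: it uses a cryptographic digest (SHA-256) as a stand-in for the electronic watermark mentioned above, which is a different technique, and the function name content_digest and the sample strings are hypothetical.

```python
import hashlib
import unicodedata


def content_digest(marked_up_text: str) -> str:
    """Return a SHA-256 digest of the coded content, independent of the carrier."""
    # Normalise to one Unicode form so that equivalent strings give the same digest.
    normalised = unicodedata.normalize("NFC", marked_up_text)
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()


# Hypothetical document content, coded as a marked-up Unicode string.
original = "<article><title>Caf\u00e9 measurements</title></article>"
# The same content after migration: "e" plus a combining accent instead of one character.
copy_on_new_medium = "<article><title>Cafe\u0301 measurements</title></article>"
# A copy whose content was really altered (the accent is lost).
tampered = "<article><title>Cafe measurements</title></article>"

fixed_digest = content_digest(original)
migrated_ok = content_digest(copy_on_new_medium) == fixed_digest
tampered_ok = content_digest(tampered) == fixed_digest
```

The digest recorded at publication time can be stored with the bibliographic record; any later copy, on whatever medium, can then be checked against it.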

2) Public availability (in principle not necessarily free of charge).

This demand is clearly medium-independent and does not need any further consideration here.

3) Persistence (i.e., it should remain in the same form and at the same location, so that it is reliably accessible and retrievable over time).

This point dovetails with the first demand and again we see a mix-up of old and new concerns. The persistence of the work has two aspects: the integrity of the appearance and the completeness of the content. Firstly, we have to deal with the problem of the integrity of appearance. This issue is also an important discussion point in the world of the archivist. In many cases, only the content of the information is important, e.g., the figures of a town or departmental budget. In other cases, the visual and textural aspects are essential for an archival object such as an official certification or a signed treaty. It goes without saying that for non-textual material, this issue is different than for text, and persistence of presentation form can be crucial. But also here, we cannot be too conservative as the pictorial presentation of a data set can be essential to spot a peculiar behaviour but in time, more sophisticated presentations might reveal more details. This argument leads us to the notion that we have to make a differentiation in such cases between the basic non-figurative data and the presentation of them by the original author. Both have to be fixed as forming together part of the author’s original publication. But the data set must be separately available from the presentation module in order to allow future authors to use and/or integrate these data with new data or with new presentation techniques to enable the publication of new work to take place.

Secondly, we have the aspect of internal integrity and coherence. This is typically an XML issue. This persistence aspect can be covered by the introduction of a complete list or map of contents as an integral part of every document. Not only do the bitstreams of every component of the document have to maintain their integrity, but also the mutual relations between the various components. We also need a mechanism to check that all components are present. This last demand can become a serious problem in the future. More and more documents will be rendered from components residing in different databases. Think about an astronomy article that calls for data from a huge database filled with satellite measurement data. As an electronic publication is, in principle, a modular entity and not an essay (Kir98a), the persistence demand requires that a publication guarantee that all components remain available. This demand is closely linked to the problem of dead hyperlinks. All this converges to the discussion on the Digital Object Identifier initiative. The International DOI foundation was "created in 1998 and supports the need of the intellectual property community in the digital environment, by the development and promotion of the Digital Object Identifier system as a common infrastructure for content management" (Doi00a, Doi00b). The DOI foundation is supported by almost all major (commercial) publishers and societies. The idea behind DOI is that every item that has an assigned copyright (hence also books) will get a unique identifier. In the course of the developments, this identifier will be endowed with metadata such as bibliographic information and genre, but also publishers’ information and price. In the first round, as experimented with in Crossref (Cro00), DOI is limited to a one-to-one link with the URL of scientific articles in a publisher’s database. In the full implementation, it is envisioned that DOI will also allow choices, e.g., to go to a copy of the identified entity, to a metadata record about the entity, or to an identical copy of the same entity at a different location (mirror site). Adding metadata to DOIs will allow the reader to choose which type of realisation of a particular document is wanted, e.g., a PDF file, an XML file or whatever other storage types are available. It is clear that the DOI approach is a strong attempt to ensure the integrity of information entities seen not only as intellectual property containers but also as a step towards electronic commerce and trade with intellectual property rights.
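The "map of contents" and the check that all components are present and unaltered can be sketched as follows in Python. This is a minimal illustration, not a description of DOI or of any existing system; the identifiers (text-1, data-1, image-1), the Entry record and the in-memory store standing in for distributed databases are all assumptions.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    component_id: str  # hypothetical identifier, not a real DOI
    sha256: str        # digest recorded when the document was published


def digest(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def check_document(manifest: list[Entry], store: dict[str, bytes]) -> dict[str, list[str]]:
    """Report which listed components are missing from the store or have been altered."""
    report: dict[str, list[str]] = {"missing": [], "altered": []}
    for entry in manifest:
        data = store.get(entry.component_id)
        if data is None:
            report["missing"].append(entry.component_id)
        elif digest(data) != entry.sha256:
            report["altered"].append(entry.component_id)
    return report


# Map of contents as fixed at publication time.
text = b"<section>Observations</section>"
table = b"ra,dec,flux\n10.1,-3.2,0.7\n"
manifest = [
    Entry("text-1", digest(text)),
    Entry("data-1", digest(table)),
    Entry("image-1", digest(b"pixels")),
]

# State of the (possibly remote) stores some years later:
# the image has disappeared and the data table has been extended.
store = {"text-1": text, "data-1": table + b"11.4,-2.9,0.9\n"}
report = check_document(manifest, store)
```

In this sketch a missing component plays the role of a dead hyperlink, while an altered one signals that a shared database has changed under the publication.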

A competing scheme for reference linking, emerging from the scientists who are engaged in the world of pre-print servers, is the Open Archives initiative. Its goal reads: "The Open Archives Initiative develops and promotes interoperability standards that aim to facilitate the efficient dissemination of content. The Open Archives Initiative has its roots in an effort to enhance access to e-print archives as a means of increasing the availability of scholarly communication." (Oai00). The aim of this initiative is to promote those active authors for whom self-archiving solutions are a preferred option. The interoperability between such archives becomes the prime research issue (Som00). Though DOI and OAI approach the problem from two antagonistic philosophical backgrounds, both schemes must in the end ensure the integrity and quality demands that are at the basis of proper scientific discourse.

4) Version control (bibliographic record must be attached to each version. A set of minimum details is suggested in the document).

As long as we talk about a document in its traditional form, version management can be straightforward. However, if a document is a composition consisting of various modules originating from different sources, new schemes have to be developed. This issue will be further dealt with in section 3. The main point is that electronic documents are no longer the smallest exchangeable entities. Many electronic documents and most professional web pages are derived from a variety of dynamic databases. A feature of an electronic document can be that it changes with time (or outside temperature, or stock market index, or rocket launch date). This electronic document can be the result of some deep science or engineering advance, and hence be a scientific publication.

A bibliographic record (metadata) is essential for fulfilment of this recommendation. The issue of metadata that entails much more than the traditional bibliographic information is also dealt with in section 3.
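What a bibliographic record per version might look like for a document composed of modules can be sketched in Python. The field names and the sample values are assumptions made for illustration; they are not the minimum details suggested by the Working Group.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VersionRecord:
    document_id: str
    version: int
    title: str
    authors: tuple[str, ...]
    date: str                                # ISO date of this version
    components: tuple[tuple[str, int], ...]  # (module id, module version)


def changed_components(old: VersionRecord, new: VersionRecord) -> list[str]:
    """List modules that were added or updated between two versions."""
    old_versions = dict(old.components)
    return sorted(
        module_id
        for module_id, module_version in new.components
        if old_versions.get(module_id) != module_version
    )


# Hypothetical records for two versions of the same document.
v1 = VersionRecord(
    "doc-42", 1, "Flux measurements", ("A. Author",), "2001-01-04",
    (("text", 1), ("dataset", 1)),
)
v2 = VersionRecord(
    "doc-42", 2, "Flux measurements", ("A. Author", "B. Author"), "2001-02-20",
    (("text", 1), ("dataset", 2), ("figure", 1)),
)
changes = changed_components(v1, v2)
```

Because each record lists the versions of its constituent modules, a reader can see that the text is unchanged while the data set was updated and a figure was added.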

Depending on the availability and affordability of the technical means, the Working Group recommends:

5) Authenticity (i.e., versions should be certified as authentic and protected from change).

Although nobody will challenge the absolute need for authentication of every published document, we run into problems when we enter the discussion of parts of a document. As referred to above, a necessary distinction is made between documents as the smallest units of communication and documents that are built up from various components. We have to understand that there is a difference between re-use and multiple use. In the case of multiple use, the citing author integrates the full body of an "information chunk" into the new work and uses it (see further section 3.2). A good example is the case of pattern recognition. The journals in this field are full of data sets (e.g., in the form of distorted pictures) on the one hand and methods on the other hand. Would it not be much more exciting if we could swap methods and data sets between authors and allow a true comparison between different methods unleashed on the same data set? This would be an extension of the already current practice in some fields, especially astronomy, to tap data from a common database; see for instance the French astronomical database Centre de Données Astronomiques de Strasbourg (CDA00).

For a first publication, next to version control, the Working Group recommends:

6) Notification (the community of one’s peers must be informed as to the version associated with priority claims).

This is an obvious though essential demand for the awareness of current and new research and the free and democratic flow of information and knowledge. Notification can be enormously enhanced by electronic tools such as bulletin boards, newsgroups and current awareness services.

7) Assignment and persistence of a Web address/location for the record.

In its document, the Working Group sometimes phrases this point as the need to identify the work unambiguously. This demand is an obvious call for retrievability. Above we already discussed the DOI and OAI programmes that try to tackle this point. The DOI foundation especially is working on this issue, as its main goal is the identification and subsequent handling of intellectual property. The problem is that the persistence of a unique identification code can never be linked to a unique URL. It is much better to ensure a unique code per item, allowing that item to flow from database to database, provided that those databases have a searchable index capable of understanding the grammar of the unique identification code.
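The separation between a persistent code and the changing locations of an item can be sketched as a small index in Python. The class name, the identifier grammar ("item:0001") and the example.org URLs are hypothetical and do not describe the DOI or OAI systems.

```python
class IdentifierIndex:
    """Maps a persistent identifier to the locations that currently hold the item."""

    def __init__(self) -> None:
        self._locations: dict[str, list[str]] = {}

    def register(self, identifier: str, url: str) -> None:
        urls = self._locations.setdefault(identifier, [])
        if url not in urls:
            urls.append(url)

    def relocate(self, identifier: str, old_url: str, new_url: str) -> None:
        # The item moves to another database; its identifier stays the same.
        urls = self._locations[identifier]
        urls[urls.index(old_url)] = new_url

    def resolve(self, identifier: str) -> list[str]:
        return list(self._locations.get(identifier, []))


index = IdentifierIndex()
index.register("item:0001", "https://publisher.example.org/articles/0001")
index.register("item:0001", "https://mirror.example.org/0001")
index.relocate(
    "item:0001",
    "https://publisher.example.org/articles/0001",
    "https://archive.example.org/store/0001",
)
locations = index.resolve("item:0001")
```

A citation that carries only the identifier remains valid after the move, which is exactly what a citation carrying the original URL cannot guarantee.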

Another issue here is of a more archival nature, namely that of every serious document, at least one copy is stored safely in an archive. This is an important and strong demand in a period where paper is on the way out and a plethora of digital media, each with their own way of data handling, emerges. This point is closely related to the first point on fixation. It is also closely related to the metadata issue. It implies that a central organization such as the Library of Congress in the USA must install a legal deposit of all items, with unique storage codes, for ever.

8) Commitment not to withdraw (authors must agree, prior to commencing the selection process, that they will not delete the document from the electronic literature).

This recommendation is a clear statement to keep lots of free-floating drafts, and worse, out of the main stream of public scientific discourse. In practice this will be a very difficult issue, as in some fields, people dump almost everything on their home web sites and feel free to send all drafts to pre-print servers. A problem arises when a second version of a draft has a slightly different title and a different number or order of authors. To impose strict adherence to version control and a commitment to not withdraw the final draft, i.e., the version open for peer review, will be very difficult, as the correction of a typo or a number, or the addition or deletion of a reference, can be important, as has already been proven in the paper world. A solution might be that an erratum is permanently linked to the original instead of stored separately as in paper journals.

For the definitive publication, the Working Group recommends, alongside persistence and version control, assignment and persistence of a web address.

9) Quality control (vetted to ensure quality), in order to maximise usefulness for science and to establish a high level of trust among readers.

With this issue, we enter the essence of quality control and the heated current debate on peer review. It is not the purpose of this contribution to discuss the various possible peer review schemes in detail. The literature on this issue is abundant, ranging from full-scale books analysing a particular journal such as "Angewandte Chemie" (applied chemistry) (Dan93) to regular contributions on the pros and cons of double-blind refereeing, nepotism and sexism in peer review, and so on. An important new aspect is the self-publishing current in science. Here, new schemes for refereeing are regularly discussed in several internet lists and discussion forums such as the September forum (Sep98), and by individual protagonists, such as the cognitive psychologist and editor of the e-journal Psycoloquy, Stevan Harnad (Hrd00).

Out of all this discussion, one thing becomes crystal clear, namely that the issue is very much domain dependent. Whilst in theoretical physics the pace of research is such that every new idea is immediately broadcast via pre-print servers, although often after internal peer review at the researcher’s institute, in more experimental fields the tempo is more relaxed. After all, it is easier to steal an idea than to redo an experiment. In medicine, the question is intrinsically more sensitive as new medical information is often rocketed to high levels of public fantasy. In this field, the discussion on ethics and misconduct is a permanent concern (Hud00). For a recent review on the domain dependency of refereeing in e-journals, see Weller (Wel00).

10) Commitment to archiving and long-term preservation.

For this point, the same holds as for the persistence point. However, the issue of long-term preservation is, more than all other issues, one of current concern. Within the archivist’s world, an enormous effort is being made to design protocols for long-term storage. As mentioned above, the problem can be split into storage of the digitalised content and storage of the textural and visual appearance as well. One important ingredient in this discussion is the scheme of Jeff Rothenberg, in which he proposes to store, next to the information item itself, also the software programs used, including the operating systems (Rot99). This very intriguing so-called emulation scheme is severely under attack from XML aficionados. A less fundamental but directly applicable scheme is discussed in the "Draft recommendation for space data system standards: Reference Model for an Open Archival Information System (OAIS)". This scheme allows the storage of heterogeneous information. Also here, astronomy and space research take the lead, as in these large fields, much information is already only available electronically (CCS99).

In the above, we critically discussed the recommendations of an International Working Group. As we have seen, the most visible tension between the very reasonable recommendations and the electronic publication, is that an electronic publication is not a paper publication stored in a different medium. In the next section, I will dwell on the unique features of electronic publications, sharpening the argument for publishing standards, which will be summarized in the last section.

3. Towards an understanding of electronic publications

As already indicated in the above discussion, before we arrive at new guidelines for electronic publishing, we need a full understanding of the differences between traditional paper documents and electronic documents. This means that we have to abstract from the currently accepted practice of scientific communications in order to define societally and scientifically acceptable rules of conduct and then apply them within the context of a new environment.

The abstract notions of the International Working Group are, of course, fine; however, the problem is in the implementation. This implementation demands a better grasp of what electronic documents are. For precisely that reason, we try in the present section to make an advance on this issue in order to specify recommendations in the final section.

3.1. The most notable feature of electronic publishing is the integration of text, image, sound and simulations

The greatest step forward in scientific communication is that we are now able to use one carrier for all possible expressions of scientific knowledge. By translating knowledge into a binary code, we create a mono-medium that allows us to integrate all kinds of representation. It thus becomes immediately clear that text will play a less prominent rôle in the future. Although language will remain the essential transfer mechanism for knowledge exchange, non-linguistic communication will regain some of the prominence lost since the written language enabled scientific communications to emerge, independent of place and time.

In the electronic future, stills and moving pictures, sounds, simulations and soon also tactile information can be exchanged and experienced, hence analysed and interpreted by different people separated by time and place (Kir98b). This means that a genuine electronic document will be a composition of text, images, sounds, animations, etc. All these components of the electronic document must adhere to quality and integrity standards. Thus, within the law of proper scientific discourse, all knowledge presentations are equal. To continue this political metaphor, we can say that we certainly need a diversity policy, to replace the period of positive discrimination of text only.

This is not the place to dwell at length on the differences between intuitive understanding by means of non-textual stimuli and scientific understanding through linguistic reasoning, but we must come to a realisation of the tenet that non-textual components will play a central rôle in the electronic document of the future.

In order to create an environment in which all this can be organised in a meaningful way, the first conclusion is that, in the first approximation, we have to consider all the various components as independent but interacting objects. This will lead to a modular approach of information.

3.2. The next most notable feature of electronic publishing is multiple use

In a traditional environment, an author refers to an earlier author and cites part of the original work by referring to the original work, quoting a short or long part of the original work, or paraphrasing some of the text. This is a typical paper-based process as it relieves the new author of the need of copying extensively from an already existing text. Only in the case of images, and then often only in review papers, do authors sometimes incorporate a full illustration from another article. In standard publishing practice, the author first requests permission from the original author and subsequently the publisher requests permission from the original publisher.

However, in an electronic environment, introducing already existing information into a new work is trivial. This is exactly the reason why the concept of modules is so crucial. In order to keep the integrity of the original work, introducing a module in a new work means introducing a complete module.

The difference between quoting and multiple use is that in multiple use, the new author can rely on the completeness and integrity of the original module. Hence, if, in a new work, a description of: a machine, the working of a medicine, or a mathematical proof is needed, reference to another work realises a new dimension. Now, we can seamlessly introduce the existing text into the new work. The old work doesn’t have to be located in a library elsewhere, but the electronic network allows us to input this information right there where it is needed.

This means that a module must be compatible with usages in different environments, indicating not that a link points to relevant information elsewhere, but rather that a link now transports elsewhere-located information into the present work.
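A link that transports a complete module into the present work, instead of pointing to it, can be sketched in Python as follows. The {{include:...}} marker, the module identifiers and the module store are hypothetical; they only illustrate the principle of multiple use.

```python
import re

# Hypothetical store of complete, self-contained modules (possibly held elsewhere).
modules = {
    "cryostat": "The sample is cooled to 4 K in a helium cryostat.",
    "proof-7": "By induction on n, the bound holds for all n.",
}

# Assumed marker syntax for "bring the whole module in here".
INCLUDE = re.compile(r"\{\{include:([a-z0-9-]+)\}\}")


def render(text: str, store: dict[str, str]) -> str:
    """Replace every include marker by the full, unmodified module it names."""
    def substitute(match: re.Match) -> str:
        return store[match.group(1)]  # raises KeyError if the module is unavailable
    return INCLUDE.sub(substitute, text)


new_work = "We use the standard setup. {{include:cryostat}} Results follow."
rendered = render(new_work, modules)
```

Because the module is inserted whole and unchanged, the new author relies on the completeness and integrity of the original; a missing module surfaces as an error and not as a silently broken reference.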

3.3. Modularity as model for electronic documents

The idea of modularity as the next step in scientific communication (Kir96, Kir98a) is further developed by Harmsze (Har00).

Harmsze proposes a new structuring of scientific articles in modular form. A module is defined as a "uniquely characterised, self-contained representation of a conceptual information unit aimed at communicating that information". This means that a module is a textual, pictorial, or other representation, of an amount of information that in itself is sufficiently comprehensive to convey meaning for a reader. Note that neither length nor size enter the definition of a module. Although Harmsze deals mainly with modules that comprise coherent texts, the model is perfectly able to integrate non-textual modules as well. In the model, a distinction has been made between elementary modules and complex modules. Depending on the purpose, elementary modules can be merged to form complex modules just as atoms bind to molecules. Two types of such "bounded" complex modules can be distinguished.

a- A compound module is a complex module that is an aggregate of (elementary or complex) constituent modules.

This is the case if the complex module itself again represents "uniquely characterised, self-contained" information of a new kind. An easy example is the complex module that describes a measuring device and consists of a series of other modules comprehensively describing more-or-less independent components such as the cooling, the memory, the housing, etc.

We can compare such a compound module with a chemical molecule that is unique in itself, but can be analysed as a set of bound molecules and atoms.

b- A cluster module is a complex module that focuses on a single concept which is a generalisation of the specific concepts dealt with in the (elementary or complex) constituent modules.

In this case, the complex module hosts a multiplicity of the same kind of information. An easy example is the complex module of a set of PET-scans from a particular part of the brain recorded from various patients. Every scan is a module in itself, with its own specific metadata. The complex module disregards the specific, e.g., the patient’s name, and concentrates on the common aspects.

We can compare this kind of complex module with the chemical example of a cluster, where we have many identical atoms weakly bound together.

Modularity allows for selected reading paths so that modules can be skipped or emphasised, depending on the reader’s wish, expertise or level of understanding.

Please note that we store information units only once! The bottom line is SGML-coded objects that will change their appearance according to the document style demanded by the presentation medium.

Unfortunately Harmsze’s approach is not the end of the analysis. If we discuss multiple use, we also have to incorporate other granularities of information as well, even down to a single number.

At all events, full modules or single data must be identifiable as unique entities in a database. This means that all coherent objects must carry inseparable metadata with them.
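The distinction between elementary, compound and cluster modules, each carrying its own metadata, can be sketched as a small data structure in Python. This is an illustrative reading of the model and not Harmsze's own formalisation; the class, field and function names are assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Module:
    module_id: str
    metadata: dict[str, str]                   # travels with the module
    kind: str = "elementary"                   # "elementary", "compound" or "cluster"
    parts: list["Module"] = field(default_factory=list)


def elementary_ids(module: Module) -> list[str]:
    """Walk a complex module down to the elementary modules it is built from."""
    if not module.parts:
        return [module.module_id]
    found: list[str] = []
    for part in module.parts:
        found.extend(elementary_ids(part))
    return found


def reading_path(module: Module, skip: set[str]) -> list[str]:
    """A selected reading path: all elementary modules except those the reader skips."""
    return [m for m in elementary_ids(module) if m not in skip]


# Compound module: a measuring device described by its components.
device = Module("device", {"type": "description"}, kind="compound", parts=[
    Module("cooling", {"type": "description"}),
    Module("housing", {"type": "description"}),
])
# Cluster module: many scans of the same kind, each with its own metadata.
scans = Module("scans", {"type": "PET scans"}, kind="cluster", parts=[
    Module("scan-1", {"patient": "A"}),
    Module("scan-2", {"patient": "B"}),
])
article = Module("article", {"title": "Example"}, kind="compound", parts=[device, scans])
path = reading_path(article, skip={"housing"})
```

Each unit is stored once and referred to by its identifier, so the same scan or description can take part in several complex modules without being copied.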

3.4. Relations as information objects

After having defined the electronic document as a collection of independent information units or modules, the next obvious step is to tackle the mutual relationships between these modules. As a database approach does not necessarily mean that we deal with one physical storage device but that the database objects can be distributed world-wide, it is logical to concentrate on the establishment of a system of relationships that not only connects the modules but immediately defines the type of connection as well.

It is crucial in the following to realise that links are considered to be anchored on both sides, source and target, and can be traversed back and forth. This means that, e.g., the characterisation "section" in one direction indicates "belongs to" in the other direction. This is technically still a tedious problem, but within the XML environment good progress is being made (XML99).

In research, part of the work is to relate previously unrelated scientific findings within a new context. In a modular environment, this process can be enhanced. The way to do this is by naming hyperlinks in such a way that the reader knows why a link is being suggested by the author. At present, we have no clue as to why hyperlinks are added; we can only find out by clicking on them. In a structured environment, we know what the reason for this link is and we can decide to follow it or not. This brings us to the tedious discussion on hyperlink taxonomies or typologies.

Unfortunately, very little has been published on this in the literature. Most of the initiatives are attempts towards a more-or-less complete list of possible notions (tags). In some works, a distinction is suggested between structural/organisational relations and rhetorical or discourse relations. Our feeling is that in a distributed database environment, we have to start with a clear differentiation between at least two, and maybe three, categories of relations: a) organisational relations, describing the structural relationship of modules, e.g., hierarchical relations such as "part of"; b) discourse relations, describing the reasoning, such as argument for/against, an example, or a clarification (the discussion on this issue is ongoing and part of current research; see Har00, Kir00 and references therein); and c) context relations, describing the context in which a certain relation is valid. Obviously, the structure of this last category might be domain-dependent.
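A minimal sketch of how such categories let a reader see why a link is offered before following it is given below. The three categories come from the text; the concrete link types, module identifiers and the helper `links_from` are assumptions for illustration and do not represent a proposed taxonomy.

```python
from enum import Enum


class Category(Enum):
    ORGANISATIONAL = "organisational"   # structure, e.g. "part of"
    DISCOURSE = "discourse"             # reasoning, e.g. "argument for"
    CONTEXT = "context"                 # where a relation is valid; may be domain-dependent


# Hypothetical typed links: (source, target, category, type)
links = [
    ("results-1", "article-1", Category.ORGANISATIONAL, "part of"),
    ("results-1", "claim-7", Category.DISCOURSE, "argument for"),
    ("results-2", "claim-7", Category.DISCOURSE, "argument against"),
    ("claim-7", "example-3", Category.DISCOURSE, "example"),
]


def links_from(module_id, category):
    """List the typed links leaving a module, restricted to one category."""
    return [(target, link_type)
            for source, target, cat, link_type in links
            if source == module_id and cat is category]


reasons = links_from("results-1", Category.DISCOURSE)
```

Here `reasons` tells the reader that "results-1" is offered as an argument for "claim-7", without the link having to be followed first.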

4. Conclusions

One goal is to establish a clear and transparent understanding of what we mean by a scientific contribution, how we guarantee quality and integrity, and how we value the intellectual ownership of its originator. In this contribution, I have tried to critically evaluate the notion of a scientific document in an electronic environment. The result of my discussion is that we have to step back from the accepted practice of paper journals, without, however, abandoning the societal and scientific morals regarding quality and integrity. People can cut and paste from each other’s work much more easily than in the past. This dynamic cooperation has to be accepted and appreciated as an advancement in communications.

Instead of trying to curb history by conservative approaches, as some publishers do when they refuse to allow authors to post their own papers on their own web sites, we have to be forward-looking.

The conclusion so far is that we face a transition in which the traditional journal article will cease to exist. This means that we have to reformulate our notions about scientific documentation. In my view, which I defend in this contribution, we have to go for a distinctly different granularity of information units than the traditional paper article allows.

1. If we define modules as conceptual units, we can apply strict rules about quality. At present, a scientific article is peer-reviewed without any discrimination between the various kinds of information in it. In a world of well-defined modules, the refereeing standard for a module Method will be distinctly different from that for the module Data-acquisition. Thus, quality control will improve.

2. If all modules are endowed with a set of metadata that clearly identifies the author and time of creation, integration of a module in another work is automatically taken care of, with due credit being given. The DOI approach is promising in this respect. Of course, people can always retype, steal and add fraudulent data, but misconduct is a social problem and not a scientific one.

3. Another interesting new outcome of this analysis is that relations, which express themselves in hyperlinks, become information objects in their own right. As relations in an electronic environment can be typed, they become objects with metadata. Thus, we have to add the bibliographic information of the originator and a time stamp. This way, the minimum scientific publication becomes the brilliant insight of a researcher who connects two separate information units by a typed link, without any further business.

4. For documents that are built from available and new modules, we will have two levels of authentication, one on the level of each module and the other on the level of the complete new work.

5. A modular publication will have a list or map of contents with links to all components as well as a new kind of abstract that reflects the content of all modules and serves as an orientation tool in the hypertext environment. Integrity covers not only the completeness of the information but also the overview and a description of the mutual relationships between the components.
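The two levels mentioned in point 4 can be illustrated with a small sketch. The paper does not prescribe a mechanism; using SHA-256 digests here is an assumption of this example, as are the function names and the sample metadata. A digest only detects alteration; establishing who created a module would additionally require something like a digital signature, which is left out.

```python
import hashlib


def module_fingerprint(author, created, content):
    """Digest over a module's content together with its metadata."""
    record = "\n".join([author, created, content])
    return hashlib.sha256(record.encode("utf-8")).hexdigest()


def work_fingerprint(author, created, module_fingerprints):
    """Digest over the composed work: its own metadata plus the ordered module digests."""
    record = "\n".join([author, created, *module_fingerprints])
    return hashlib.sha256(record.encode("utf-8")).hexdigest()


# Hypothetical modules: one reused from an earlier author, one new.
reused = module_fingerprint("A. Author", "1999-05-01", "Method as published earlier")
new = module_fingerprint("B. Author", "2001-01-04", "New data analysis")
work = work_fingerprint("B. Author", "2001-01-04", [reused, new])

# Altering the reused module changes its digest and that of the whole work.
tampered = module_fingerprint("A. Author", "1999-05-01", "Method, silently altered")
tampered_work = work_fingerprint("B. Author", "2001-01-04", [tampered, new])
```

The module-level digest travels with the reused module and keeps its original author and date, while the work-level digest belongs to the author of the new composition.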

Therefore, the lesson of this contribution is that electronic media enhance the integration of textual and non-textual knowledge representations, enabling a proper conceptual segregation between various kinds of knowledge and therefore allowing for more specific refereeing. The flip side of these new capabilities is that we have to develop a stable system of domain-dependent metadata for modules and relations that steer the logistics and storage of these modules and relations. We can think back wistfully to the stable situation of established peer-reviewed journals we built over the last century; however, the unknown is the object of science and we are entering a new and unknown phase in scientific communication. Therefore, we have to make sure that our societal and scientific demands for quality and integrity are not mixed up with the latest fashion in technology. Technology is enabling us to expand scientific communication into a serious mix of textual and non-textual components. For most of the non-textual components we don’t even have a good insight into what the quality standards are. Like all real advancement in science, the development of scientific communication will go through experimental phases. From the analysis of these experiments we will be able to develop new standards and rules. It is a matter of the highest importance that the scientific community takes this experimenting seriously and does not bend to conservative forces that try to restrict the developments to the known and established practices of the paper world.

References

CCS99 Consultative Committee for Space Data Systems. CCSDS 650.0-R-1: Reference Model for an Open Archival Information System (OAIS). Red Book. Issue 1. May 1999. http://www.ccsds.org/review_books.html

CDA00 Centre de Données astronomiques de Strasbourg. http://cdsweb.u-strasbg.fr/CDS.html

Cro00 Crossref. The central source for reference linking. www.crossref.org

Dan93 H.-D. Daniel. Guardians of science. Fairness and reliability of peer review. Translated by William E. Russey. VCH, Weinheim 1993.

Doi00a Home page Digital Object Identifier Foundation. www.doi.org

Doi00b The DOI handbook Version 0.5.1. 11 August 2000. http://www.doi.org/handbook_2000/index.html

Gar79 W.D. Garvey. Communication: The Essence of Science. Pergamon Press, Oxford 1979.

Har00 Frédérique Harmsze. A modular structure for scientific articles in an electronic environment. PhD dissertation, University of Amsterdam, 2000. The full text and appendices are available via: www.science.uva.nl/projects/commphys/papers

Hrd00 For Stevan Harnad's publications on the internet, see http://cogsci.soton.ac.uk/~harnad/intpub.html

Hud00 Anne Hudson Jones and Faith McLellan (eds.) Ethical Issues in Biomedical Publication. Johns Hopkins UP, Baltimore 2000.

IWG99 International Working Group. Defining and Certifying Electronic Publication in Science. A proposal to the International Association of STM Publishers. http://associnst.ox.ac.uk/~icsuinfo/aaas-stm.htm

Kir96 Joost G. Kircz and Hans E. Roosendaal. Understanding and shaping scientific information transfer. In: Dennis Shaw and Howard Moore (eds.) Electronic publishing in science: proceedings of the joint ICSU Press/UNESCO Expert Conference Paris February 1996. Unesco Press 1996 pp. 106-116.

Kir98a Joost G. Kircz. Modularity: the next form of scientific information presentation? Journal of Documentation, vol.54, no. 2, March 1998, pp. 210-235. The final draft can be found on: www.science.uva.nl/projects/commphys/papers

Kir98b Joost Kircz. Nouvelles présentations! Nouvelle science? In: L’écrit de la science, Writing science. Forum Européen de la science et de la technologie (DGXII), Nice 1998. Alliage no. 37-38, Hiver 98 - Printemps 99, pp. 14-24. For an English version.

Kir00 Joost G. Kircz and Frédérique Harmsze. Modular scenarios in the electronic age. Conferentie Informatiewetenschap 2000, Rotterdam 5 April 2000.

Mea98 A.J. Meadows. Communicating Research. Academic Press, San Diego 1998.

OAi00 Open Archives Initiative. www.openarchives.org

Pla99 Raymond L. Plante, Richard M. Crutcher, Robert E. McGrath. The NCSA astronomy digital image library: from data archiving to data publishing. Future Generation Computer Systems 16 (1999), pp. 49-61.

Rot99 Jeff Rothenberg. Avoiding Technological Quicksand: Finding a viable technical foundation for digital preservation. Council on Library and Information Resources, Washington, DC, 1999. http://www.clir.org/pubs/reports/rothenberg/contents.html

Sep98 September 1998 American Scientist Forum. september98-forum@listserver.sigmaxi.org

Som00 Herbert van de Sompel and Carl Lagoze. The Santa Fe Convention of the open archives initiative. D-Lib Magazine February 2000, Vol.6. Number 2. http://www.dlib.org/dlib/february00/vandesompel-oai/02vandesompel-oai.html

Wel00 Ann C. Weller. Editorial peer review for electronic journals: current issues and emerging models. Journal of the American Society for Information Science 51(14) 2000 pp. 1328-1333.

XML99 www.w3.org/TR/NOTE-xlink-req