Publishing research about the Semantic Web


How should the Semantic Web research community publish its papers? It’s a question that more and more people are asking.

The Semantic Web community is founded on the principle of openly sharing knowledge on the Web. However, the research papers that our community publishes are often locked behind pay-walls. One of the main journals in the area – the Journal of Web Semantics (JoWS/JWS) – is published by Elsevier, whose business practices have been coming under increasing scrutiny from researchers. The proceedings of the two main conferences in the area – the International Semantic Web Conference (ISWC), and the European/Extended Semantic Web Conference (ESWC) – are published by Springer, which faces similar, though perhaps more tempered, scrutiny from researchers. There are increasingly loud calls from some members of our community to move away from both Elsevier and Springer, but such a move implies many questions. What, precisely, is the problem? What are the alternatives? What are their potential costs or downsides?

Here we’ll address some of these questions relating to academic publishing, starting with some general context on the problem. During this conversation, we’ll introduce alternative routes for publishing, and examples of successful conferences and journals in CS that have opted for these routes. Later we will also cover the question of conferences vs. journals, their relative strengths and weaknesses, and make an initial but concrete proposal for how Semantic Web research could (or perhaps should?) be published in future, intended as food for thought, but also as a way to hopefully move the conversation forwards.

A summary …

[UPDATE 2022-11-25: Adding this per the suggestion of Claudio Gutiérrez.]

As a quick summary of the main points we’ll cover:

  • Publishers such as Elsevier and Springer charge universities, research institutes and/or governments exorbitant prices for subscriptions in order for researchers to access their papers (resulting in the serials crisis).
  • Commercial publishers such as Elsevier report very high profit margins, and employ questionable business practices. This has led to widespread boycotts of such companies (e.g., The Cost of Knowledge).
  • More and more funding institutions and governments are requiring the publications they financially support to be published under Open Access (OA). Hence commercial publishers have begun to offer so-called “Gold OA”, where authors pay a once-off fee (often in the thousands of dollars) for the paper to be published OA. This has raised concerns about publishers profiting in new ways from the push for OA.
  • The costs of subscriptions and/or OA fees do not seem to reflect the fact that publishing on the Web is cheap (we will look at arXiv as an example), that much of the editorial work of the conferences and journals in Computer Science is done by volunteers, that we can typeset our own papers to a high standard, that libraries can offer low-cost archiving services, etc.
  • Many prestigious conferences and journals in Computer Science have already moved to “Diamond OA”, where neither authors nor readers pay for OA. The Machine Learning community has been most proactive in this regard and has been pushing Diamond OA since 2001.
  • We look at reasons why the Semantic Web community continues to publish with Springer and Elsevier, why it might be time to move, and what some of the obstacles might be when considering moving to a Diamond OA model for ISWC and JWS.
  • We also cover the topic of publishing via conferences vs. journals, their relative advantages and disadvantages, and some possible ways in which the community could “blend” the two options to inherit the benefits of both conferences and journals while mitigating the pain points specific to either.
  • We make a concrete proposal for three new types of publishing venues for the Semantic Web based on the principles of Diamond OA, and offering the community options beyond the traditional conference vs. journal dichotomy.

These topics are covered in more detail below.

The Web, pay-walls, and the serials crisis

First some (historical) context for those new to the issue …

Recognising the convenience of the Web for accessing digital versions of papers, Springer launched SpringerLink in 1996, while Elsevier launched ScienceDirect in 1997. These are online portals with digital versions of publications available for download that remain heavily used for research today. The portals are pay-walled – requiring subscriptions or one-time payments – in order to access content online.

So long as our institution has a subscription, when we visit the pay-wall, we will get a free link to a PDF, and might not think much more of the issue (unless we are at home, and need to use a VPN, Proxy or Log-in to access the paper). But these “free” PDFs come at a significant cost to our institutions …

How much does a typical university pay to Elsevier each year for subscriptions? Elsevier has fought legal battles to avoid such data becoming public, but it would appear to depend on various factors such as enrolments at the university and their ability to negotiate; however, we know, for example, that Cornell University (with 20,000 enrolments) paid USD$2.4 million to Elsevier in 2009 for the “Elsevier Freedom Package”. We choose Cornell here as it provides funds for arXiv, which we’ll come back to a little later; you can click the previous link to see amounts for other U.S. universities. There are also some data available about subscription costs for U.K. universities, where we know that Oxford paid GBP£990,775 to Elsevier in 2013, before tax (which would add an extra ~20%).

What about Springer? We know that Carnegie-type universities in the U.S. paid about a third of Elsevier’s costs for Springer subscriptions in 2009. So Springer subscriptions are considerably cheaper than Elsevier, but in absolute terms, we can expect that they are in the order of hundreds of thousands of euros/pounds/dollars, etc., per year for larger universities in developed countries.

Unfortunately I did not find many statistics for developing countries (though probably there is more info out there on the Web). I did find that the CAPES organisation in Brazil paid out BRL$480 million (around US$100 million) for subscriptions to enable the Periodicals Portal in 2020, which provides access for Brazilian researchers to published research (this likely includes multiple publishers, not just Elsevier, Springer etc.). [UPDATE 2024-12-09: Lorena Etcheverry forwarded statistics from Uruguay, wherein the government finances access to such collections in a centralised manner via a portal called “Timbo” (controlled by I.P.). The government paid USD$1,700,000 for national access to Elsevier in the period of 2022–2025.]

The rising cost of subscriptions has given rise to what has been called the “serials crisis”; specifically, subscription costs have grown at a rate far higher than inflation for several decades. Is this due to increasing costs? More and more papers are being published, after all. We’ll come back to the costs of online publishing later, but for now, consider the profits of Elsevier, for example.

Elsevier, the Dutch publisher, is part of the holding company RELX, a publicly-listed British company. RELX itself offers a range of services, but reported revenue of GBP£2.6 billion in 2021 specifically in the Scientific, Medical and Technical domain, where about 50% of this amount stems from publications (e.g., through ScienceDirect), while 35% stems from databases and analytics (e.g., Scopus); through the likes of ScienceDirect, they publish approximately 50,000 papers per month, with 19 million publications hosted. What about expenses? Their operating profits (total profits before tax and interest) in this domain were around GBP£1.0 billion in 2021, i.e., about 38% of revenue.
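As a quick check of the arithmetic, the margin follows directly from the two figures quoted above. The sketch below is only illustrative: the names are my own, and the inputs are the rounded amounts given in the text rather than exact values from RELX's accounts.

```python
# Rounded figures quoted in the text for RELX's scientific publishing
# segment in 2021 (assumed inputs, not exact values from the annual report).
REVENUE_GBP = 2.6e9            # reported revenue
OPERATING_PROFIT_GBP = 1.0e9   # reported operating profit
PUBLICATIONS_SHARE = 0.50      # share of revenue attributed to publications


def operating_margin(profit: float, revenue: float) -> float:
    """Operating profit expressed as a fraction of revenue."""
    return profit / revenue


margin = operating_margin(OPERATING_PROFIT_GBP, REVENUE_GBP)
publications_revenue_gbp = REVENUE_GBP * PUBLICATIONS_SHARE

print(f"Operating margin: {margin:.0%}")
print(f"Revenue from publications: GBP {publications_revenue_gbp / 1e9:.1f} billion")
```

With these rounded inputs, the margin comes out at roughly 38%, and the portion of revenue attributable to publications at roughly GBP£1.3 billion.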

How are such profits generated?

In a 2005 report, Deutsche Bank referred to the underlying business model of publishers like Elsevier as the “triple-pay” model, where taxpayers pay for researchers to write papers, review papers, and read papers, and yet significant profits end up in the hands of publishers. The report suggests that profit margins in the industry are “extremely high”, and that such publishers add little value to the research process.

We suggest that readers consider the margins (just momentarily) as taxpayers rather than investors. How happy are you, as taxpayers, that your governments are enabling private sector operators, with very little invested capital, to earn 40% operating margins?

[…]

The industry structure can only be described as bizarre – the state funds most research, pays the salaries of most of those checking the quality of research (in peer review processes), and then buys most of the published product. This has been rather elegantly described as the “triple-pay” model.

Turning the Supertanker: Deutsche Bank on Elsevier’s excess

How have companies like Elsevier managed to maintain such a lucrative position?

Elsevier find themselves in a strong negotiating position in terms of subscription fees and contracts. Companies like Elsevier hold exclusive rights on papers published with them, where only Elsevier can bestow access to papers published by Elsevier; these papers are important to researchers, and not replaceable by the papers published by other companies. Such companies are often negotiating with publicly-funded institutions, and (as mentioned before) go to considerable lengths to prevent individual institutions from learning how much other institutions are paying, thereby limiting their collective bargaining power. There are other issues with Elsevier that we could get into, but they are well-documented elsewhere.

As a result of the serials crisis, a growing number of countries and institutions are now refusing to pay the subscription fees charged by publishers, and in particular, Elsevier. National agencies in countries like Germany, Norway and Sweden have, in some years, refused to pay the subscription fees asked by Elsevier, effectively cutting off access for researchers in these countries to free PDFs via ScienceDirect.

The end result is that not everyone has unfettered access to a subscription. These researchers might be lucky enough to find a preprint through a search engine like Google Scholar, but in some cases the publication is nowhere else to be found other than behind a pay-wall.

In the absence of a subscription, and faced with a pay-wall, readers can pay out-of-pocket (or, more commonly, through research funds). The ISWC proceedings come in two parts, where Springer currently charges USD$89 for access to the eBook of one such part, while access to a PDF of one conference paper costs USD$29.95 (you may not see prices if within the pay-wall). A JWS paper costs USD$24.99. While, again, these papers are sometimes (perhaps even often) available elsewhere, such as in the form of preprints published on homepages or arXiv, this depends on the authors uploading preprints elsewhere, or on the conference/journal managing a preprint server, which is not always the case. Even when a preprint is available elsewhere, it’s not always clear how these unofficial versions differ (if at all) from the official publication.

The push for Open Access

Pay-walls, the serials crisis, the business practices of publishers, and more besides, have led to pushback from individual researchers, institutions, and governments. With respect to individuals, the Cost of Knowledge is a boycott of Elsevier by 20,427 researchers at the time of writing, who refuse to participate as authors, reviewers, editors, etc., with any Elsevier publication. With respect to institutions, as aforementioned, many are refusing to pay the subscription fees asked of them by companies such as Elsevier. With respect to governments, legislation in various countries – including the U.S., the E.U., Chile, etc. – has established various types of legal mandates regarding how the research their taxpayers fund is published, typically requiring OA publication of all such research works.

We thus find ourselves in the midst of a global push towards Open Access (OA) publishing of publicly-funded research.

Types of Open Access

Open Access generally involves making publications and other resources available for free to all interested readers, typically over the Web. It does not by itself, however, preclude the possibility of authors or their institutions paying for such services. Nor does it necessarily require that the publishers offer free access to the publications. There are different naming schemes for different types of OA, but we will look at some of the types of OA most relevant for our community:

  • Black OA: Publications are pay-walled, but are freely available through databases (e.g., Sci-Hub) or other means (e.g., requests on Twitter via the #ICanHazPDF tag) that breach copyright or other legal agreements.
  • Green/Grey OA: Official publications are pay-walled, but their principal content is made available for free to all in ways that are compatible with publishing agreements, and often as informally-published grey literature, such as preprints published on author homepages or servers like arXiv. (As a side note, “Green OA” is, in my opinion, a highly misleading name as green is associated with approval; I think “Grey OA” is a much better name, indicating the grey nature of this form of OA.)
  • Gold OA: This involves final publications being made available for free on the publisher’s online portal, but involves authors or their institutions paying article processing charges (APCs), i.e., publication fees. Content may also be liberally licenced for reuse (CC), often with the attribution (BY) clause to require citation (if the licence for the publication is missing, but the publication is freely available through the publisher, it may sometimes be referred to as Bronze OA).
  • Diamond/Platinum OA: This involves final publications being made openly available for free on the publisher’s online portal without APCs.
  • Hybrid OA: Some, but not all, papers for a conference proceedings or journal are made available under OA. This is an increasingly common practice where authors who want OA through the publisher pay APCs; if authors do not pay, the publication will be pay-walled.

When people speak about OA, they often implicitly assume Hybrid, Gold and/or Diamond OA.

We note here that the models employed by the main conferences and journals in the Semantic Web community are as follows:

JWS, ISWC and ESWC have opt-in APCs that authors can pay to make their papers freely available via the publisher’s portal; if they do not pay, the papers will be pay-walled on the publisher’s portal (but may otherwise be available through Green OA). SWJ, under Gold OA, requires all authors to pay APCs so all papers are published OA (though fees for authors from some countries are waived).

Advantages and disadvantages of Hybrid & Gold OA

Given the trend towards OA, including the national OA mandates previously mentioned, most commercial publishers in the academic space have expanded their business model to include options for paying APCs, where instead of the reader (or their institution) paying publishers, in theory at least, it’s the author (or their institution) that foots the bill. Typically authors can opt-in to pay APCs, creating a Hybrid OA model. Some such journals require authors (with exceptions) to pay APCs, creating a Gold OA model.

Proponents of Hybrid/Gold OA point to a number of advantages. First and foremost, readers have free, direct access to the final version of publications through the official publishers’ portals, as found in prominent indexes, accessible via DOI, etc. They do not need to resort to Black OA. They do not need to search through the Web for Green OA/grey literature and wonder to what extent the version they found reflects the final revised version, or what the differences might be. Publishers no longer hold exclusive control over the content published under Hybrid/Gold OA with them. Importantly, authors will often hold the copyright for their (unpaid) works, and will also avoid exclusive publishing agreements, removing legal barriers to share, extend or adapt their work as they see fit.

But Hybrid/Gold OA has drawbacks. Principally, authors must find funding to pay APCs, which can be costly.

More generally, while more and more research grant bodies are including categories of expenses for APCs associated with OA, such fees can be a considerable drain on a research budget, leading authors to continue to prefer Green or No OA in order to prioritise other research costs (e.g., funding students, conference travel, etc.). In the case of Chile, for example, where key national agencies have begun to mandate OA publishing for the research they fund, a study by Krauskopf has found that about USD$9,129,939 was spent in 2019 on APCs for OA papers with a Chilean co-author. He remarks that:

Unfortunately, the cost of publishing in some journals is so high that it causes detrimental effects on the research capacity of under resourced individuals.

Article processing charge expenditure in Chile: The current situation

[UPDATE 2024-12-09: Lorena provides a link to this paper on APC costs for Uruguay, which suggests that the costs were approximately USD$200,000 in 2022, and appear to be rising rapidly.]

The Gold OA option, in particular, can push considerable costs to authors and their institutions, where those with limited research budgets may no longer be able to publish at Gold OA venues with high APCs. It would seem to follow that if the most prestigious venues in a community charge high APCs, under-resourced researchers will be marginalised within this community.

[UPDATE 2022-11-18: Jérôme in the comments points to the OpenAPC website, hosted by the Bielefeld University Library, which tracks how much individual organizations are paying in APCs, as well as which journals and publishers receive the most in APCs, with filters for year, country, etc. From this, we can see, for example, that well-known publishers are receiving millions in revenue each year as APCs. We can also see how much APCs are draining the resources of individual institutions.]

How costly is Hybrid/Gold OA for publishing Semantic Web research? How much are the APCs?

For a researcher publishing several papers a year, clearly Gold OA can constitute a considerable drain on their research budget. Even though SWJ’s publication fees are reasonable when compared with the fees charged by Elsevier and Springer, USD$550 is not nothing, and will have opportunity costs (in my own case, it may mean less funds to pay students, or for our travel, etc.).

Hybrid/Gold OA has also given rise to novel concerns about the business practices of publishers like Elsevier and Springer. In particular, Hybrid OA (the current model for JWS, ISWC and ESWC) raises questions surrounding “double-dipping”, whereby authors pay APCs to make their publications available for free via the publisher’s portal, but the publisher still sells subscriptions to the journal that include the OA (and other non-OA) publications. While publishers such as Elsevier claim not to double-dip, and they certainly do not charge one-off fees for OA publications, subscriptions are often bundle deals, either for a particular journal, or a particular set of journals, and it’s not clear that OA publications are being “deducted” from the prices of such subscriptions (nor is it clear what this means precisely, given that in the context of a company like Elsevier, subscription fees are not tethered to costs, costs are not transparent, etc.). Along these lines, Science Europe has stated that:

The hybrid model, as currently defined and implemented by publishers, is not a working and viable pathway to Open Access. Any model for transition to Open Access supported by Science Europe Member Organisations must prevent ‘double dipping’ and increase cost transparency.

Principles for the Transition to Open Access to Research Publications

For more on the issue of double-dipping, see this report by Mittermayer.

If Hybrid/Gold OA have such drawbacks, what, then, of the possibility of Diamond OA, where neither readers nor authors are charged? How are the expenses covered? Has any conference or journal managed this?

The feasibility of Diamond OA

Publishing papers online in a persistent manner while charging zero fees for publishing or accessing those papers seems to be unrealistic. Servers cost money. Tech support costs money. Administration costs money. And so on. Who’s paying for this? And how much? How reliable are these services?

Given that there is a question of confidence, we can start with an imperfect example, but one that many readers will be familiar with. Regarding the cost of open and persistent publishing, at the time of writing, arXiv has published 2.2 million papers online, at a rate of about 15,000 papers per month, without fees to authors or readers. The example is imperfect as arXiv is not a journal, nor a conference, nor is it peer-reviewed: it’s a platform for publishing preprints. But it implements many of the key ingredients for Diamond OA (we will talk about examples of Diamond OA peer-reviewed conferences and journals later, some of which are built over arXiv, some of which are not).

So who covers the costs? arXiv’s costs in 2021 were USD$2.4 million. This corresponds to about USD$16 per newly published paper in 2021, or a little more than USD$1/year when considering all papers hosted; of course, these amounts somewhat overestimate the cost of publishing a new paper, or hosting an individual paper per year, where most of arXiv’s costs relate to personnel, which does not relate linearly to number of publications, and is probably closer to O(1) with respect to publication number. The costs are covered by Cornell University, which hosts the service, the Simons Foundation, and member contributions.
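The per-paper estimates above can be reproduced with a few lines of arithmetic. This is a rough sketch using only the rounded figures quoted in the text; the names are my own, and it assumes the rate of 15,000 papers per month held throughout 2021.

```python
# Rounded figures quoted in the text (assumed inputs for illustration).
ARXIV_COSTS_2021_USD = 2_400_000
PAPERS_HOSTED = 2_200_000
NEW_PAPERS_PER_MONTH = 15_000


def cost_per_paper(total_cost_usd: float, papers: int) -> float:
    """Spread a total annual cost evenly over a number of papers."""
    return total_cost_usd / papers


# Assumption: the quoted monthly rate applies to all twelve months of 2021.
new_papers_per_year = NEW_PAPERS_PER_MONTH * 12

cost_per_new_paper = cost_per_paper(ARXIV_COSTS_2021_USD, new_papers_per_year)
cost_per_hosted_paper = cost_per_paper(ARXIV_COSTS_2021_USD, PAPERS_HOSTED)

print(f"Per newly published paper: USD {cost_per_new_paper:.2f}")
print(f"Per hosted paper per year: USD {cost_per_hosted_paper:.2f}")
```

With the rounded monthly rate, the cost per new paper comes out at roughly USD$13, a little under the USD$16 quoted above, which presumably reflects a somewhat different paper count for 2021; the cost per hosted paper comes out at a little more than USD$1 per year. Either way, the amounts are in the low tens of dollars per new paper.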

To put these costs in perspective, arXiv’s collection of 2.2 million papers can be compared with Elsevier’s collection of 19 million papers, and arXiv’s publication rate of 15,000 papers a month can be compared with Elsevier’s rate of 50,000 papers per month. arXiv’s total costs of USD$2.4 million in 2021 can be contrasted with Elsevier’s revenue of GBP£2.6 billion in 2021, and operating profit of GBP£1.0 billion in the same year. Even taking the 50% of the Elsevier figures that relate to publications (leaving out, e.g., database access, analytics, etc.), and adjusting for the number of publications handled, the difference with arXiv’s costs still involves multiple orders of magnitude.
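To make the "orders of magnitude" claim concrete, here is a back-of-the-envelope sketch that normalises both sets of figures per newly published paper. It compares Elsevier's publication revenue with arXiv's costs, as the text does, so it is not a like-for-like cost comparison. The exchange rate is my own rough assumption and does not come from the text; all other inputs are the rounded figures quoted above.

```python
import math

# Rounded figures quoted in the text (assumed inputs for illustration).
ELSEVIER_REVENUE_GBP = 2.6e9
ELSEVIER_PUBLICATIONS_SHARE = 0.50
ELSEVIER_PAPERS_PER_MONTH = 50_000
ARXIV_COSTS_USD = 2.4e6
ARXIV_PAPERS_PER_MONTH = 15_000

# ASSUMPTION (not from the text): a rough 2021 exchange rate, used only
# so that both amounts are expressed in the same currency.
ASSUMED_USD_PER_GBP = 1.37


def per_new_paper(annual_amount: float, papers_per_month: int) -> float:
    """Annual amount divided by the number of papers published in a year."""
    return annual_amount / (papers_per_month * 12)


elsevier_publications_usd = (
    ELSEVIER_REVENUE_GBP * ELSEVIER_PUBLICATIONS_SHARE * ASSUMED_USD_PER_GBP
)
elsevier_per_paper = per_new_paper(elsevier_publications_usd, ELSEVIER_PAPERS_PER_MONTH)
arxiv_per_paper = per_new_paper(ARXIV_COSTS_USD, ARXIV_PAPERS_PER_MONTH)

ratio = elsevier_per_paper / arxiv_per_paper
orders_of_magnitude = math.log10(ratio)

print(f"Elsevier publication revenue per new paper: USD {elsevier_per_paper:,.0f}")
print(f"arXiv cost per new paper: USD {arxiv_per_paper:,.2f}")
print(f"Ratio: about {ratio:,.0f}x ({orders_of_magnitude:.1f} orders of magnitude)")
```

Under these assumptions the ratio is a little over 200, i.e., somewhat more than two orders of magnitude; the result is not very sensitive to the exact exchange rate chosen.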

One might argue that comparing arXiv with Elsevier is comparing apples with oranges, and this is true considering the different scopes and services they provide, but arXiv still manages to satisfy what would seem to be the key ingredients of Diamond OA: publishing papers online in a reliable (and scalable) way with no fees to authors, nor to readers. It’s also worth keeping in mind that Cornell University (which hosts arXiv) paid USD$2.4 million to Elsevier in 2009: almost the exact same amount as arXiv cost to run, in total, in 2021, and almost three times what Cornell contributed to arXiv in the same year, without adjusting for 12 years of inflation: one large U.S. university’s annual subscription to Elsevier alone could pay to run arXiv, in its entirety, for one year, and with considerable change to spend elsewhere.

One might rightly point out that, unlike Elsevier, arXiv also does not provide authors with services like copy-editing, typesetting, etc. But personally speaking, the latter is a plus in the arXiv column, as Elsevier does not use LaTeX for typesetting, but rather converts LaTeX sources to its own typesetting mark-up, and the resulting proofs, in my experience, often introduce errors that the authors need to check and correct (and look uglier, in my opinion). I’ve also never had a paper heavily copy-edited by any journal (though I have the benefit of being a native speaker). I have seen some light copy-editing regarding formal elements like affiliations and references, but in such a way that does not, in my opinion, change all that much in terms of the content of the research paper.

One might also point out that arXiv does not handle peer-review (rather there’s a lighter moderation process). But peer-review in our community is done for free, by volunteers. And there are examples of prestigious peer-reviewed journals that charge no fees to authors nor to readers. Various open and free platforms now exist for handling peer review, such as OpenReview.net.

The classic example (though not the first nor the only of its kind – we will list various other examples later) of a Diamond OA peer-reviewed journal is that of the Journal of Machine Learning Research (JMLR), which began life not in 2021, but rather in 2001, when 40 Editorial Board members resigned from the journal Machine Learning (published at the time by Kluwer, and now by Springer) stating that:

In summary, our resignation from the editorial board of MLJ reflects our belief that journals should principally serve the needs of the intellectual community, in particular by providing the immediate and universal access to journal articles that modern technology supports, and doing so at a cost that excludes no one.

Editorial Board of the Kluwer Journal, Machine Learning: Resignation Letter

What about the prestige of JMLR? We could cite high impact factors that outstrip those of its non-OA predecessor, but such metrics are provided by the likes of Elsevier and Clarivate, and, though influential, are problematic in pretty basic ways. We might rather (also) consider the following remarks by (Turing Award Winner) Yann LeCun in response to a blog post arguing for the benefits of commercial publishing in the academic field:

I am a computer scientist. The best publications in my field are not only open access, but completely free to the readers and to the authors. The best example is the Journal of Machine Learning Research: its website is hosted at MIT (marginal cost = 0), its reviewing/selection process is performed by peers on a voluntary basis (as with every serious journal), and there is no need for an editorial and production process because it turns out that computer scientists are good with computers: they can actually do the typesetting of their own papers. That’s the way of the future.

Uninformed, Unhinged, and Unfair — The Monbiot Rant (comments section)

So if JMLR – a prestigious, fully-indexed, peer-reviewed journal – does not charge any publication or subscription fees, how does it cover its costs? The answer, which is explained here in detail by a publisher associated with the journal (for those with the time, I highly recommend clicking that previous link for more details), is not at all surprising:

By far the largest costs are the labor required for peer reviewing and its management by the editorial board, but this is all volunteer effort as in most all scholarly journals. The primary people involved, the editor-in-chief, managing editor, and production editor, are all unpaid […]. MIT implicitly underwrites some clerical help, since Kaelbling’s administrative assistant at MIT does a small amount of work for the journal, amounting to a few hours per year.

The webmaster is a student volunteer. […] MIT provides the web server, saving JMLR the tens of dollars per month they would otherwise have to pay for commercial hosting. Kaelbling has paid for the domain name jmlr.org out of her own pocket. The going rate for .org domains is about $15 per year.

[…]

The biggest expense, it turns out paradoxically, is paying a tax accountant.

An efficient journal

This 2012 post concludes that for the almost 1,000 articles the journal had published at that time, the financial costs worked out at around USD$6.50 per article. In other words, JMLR publishes 1,000 OA articles with costs equivalent to what Elsevier currently charges for publishing 3 OA papers in JWS. We also mention that the tax accountant’s salary – the biggest chunk of that cost – is presumably ~O(1) with respect to the number of papers published. And it seems like those tax costs will come down. (Worth a mention that printed versions of JMLR issues are also available, on demand, at a relatively low cost.)
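The comparison can be worked through as follows. The sketch only rearranges figures stated in this post: the JWS fee is not given directly here, so the value computed below is merely what the "3 OA papers" comparison implies, and the SWJ fee of USD$550 is the one mentioned earlier. All names are my own.

```python
# Figures taken from the text (rounded; assumed inputs for illustration).
JMLR_ARTICLES = 1_000
JMLR_COST_PER_ARTICLE_USD = 6.50
EQUIVALENT_JWS_OA_PAPERS = 3   # from the "3 OA papers in JWS" comparison
SWJ_APC_USD = 550              # SWJ publication fee mentioned earlier

# Total financial cost of publishing the first ~1,000 JMLR articles.
jmlr_total_cost_usd = JMLR_ARTICLES * JMLR_COST_PER_ARTICLE_USD

# APC per JWS paper implied by the comparison (derived, not a quoted price).
implied_jws_apc_usd = jmlr_total_cost_usd / EQUIVALENT_JWS_OA_PAPERS

# The same total expressed as a number of SWJ publication fees.
equivalent_swj_papers = jmlr_total_cost_usd / SWJ_APC_USD

print(f"JMLR total cost: USD {jmlr_total_cost_usd:,.0f}")
print(f"Implied JWS APC: USD {implied_jws_apc_usd:,.0f}")
print(f"Equivalent number of SWJ fees: {equivalent_swj_papers:.1f}")
```

In other words, the total of about USD$6,500 implies a JWS fee in the region of USD$2,200 per paper, and would cover roughly a dozen papers at SWJ's rate.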

JMLR set a precedent in the Machine Learning community: key conferences, like NeurIPS, ICML and ICLR, would later follow suit and publish under Diamond OA. In response to the new journal Nature Machine Intelligence that was launched in 2019, a total of 3,609 researchers at the time of writing – including Yoshua Bengio, Yann LeCun, Jeff Dean, and many others – pledged to not have any involvement with the new journal, stating that:

We see no role for closed access or author-fee publication in the future of machine learning research and believe the adoption of this new journal as an outlet of record for the machine learning community would be a retrograde step. In contrast, we would welcome new zero-cost open access journals and conferences in artificial intelligence and machine learning.

Statement on Nature Machine Intelligence

In response, the journal made some allowances to publish preprints linked from the site, and has done well in terms of metrics like impact factors since release, but is still the subject of a major boycott within the community.

So why is the Semantic Web community still publishing with Elsevier and Springer in 2022?

Twenty years on from the inauguration of JMLR, which installed free OA as the expectation for Machine Learning research, the Semantic Web research community is still publishing papers with Elsevier and Springer. And though the Semantic Web journal offers more reasonable rates, USD$550 is not nothing.

When JWS, ISWC and ESWC first started out in the early 2000s (around the same time that editors resigned from MLJ to form the Diamond OA alternative JMLR), being associated with established publishers was inarguably a useful boon in terms of prestige, confidence, etc., and being associated with these publishers has (in some cases) lent more “weight” to the respective publications in the eyes of certain universities and research institutes. This was likely an important ingredient for attracting submissions, forming a community, etc. Being associated with Elsevier and Springer was, likely, an important “leg up” for the venues, and for the incipient community.

Still, the Semantic Web community is intertwined with the principle of openly sharing knowledge via the Web. For this reason, senior members of the community managed to negotiate various ways to make the papers published in ISWC, ESWC, JWS and SWJ available for free, on the Web, with no cost to authors nor readers. This involved free preprint servers and other forms of grey literature under Green OA, special deals to make papers available for free for a limited time, etc. This was seen as a reasonable compromise.

However, institutions are still paying money for subscriptions to Elsevier journals and Springer proceedings/journals, including ISWC, ESWC and JWS. And in the case of Elsevier, in particular, popular boycotts such as The Cost of Knowledge continue to gain momentum. Some researchers are refusing to participate with JWS as a result. A growing number of countries and institutions are now refusing to pay the exorbitant subscription fees charged by publishers such as Elsevier, causing complications for researchers to access research (requiring recourse to grey literature). As an anecdote, recently I was PC Co-Chair for ISWC 2022’s Research Track, and the temporarily free links to papers in the Springer Proceedings were not working during the conference, meaning that attendees could not access papers at the start of the conference until the issue was resolved by Springer, further meaning that we had to send preprints from Easychair to session chairs so they could have a look before the talks.

So what are the barriers to moving ESWC/ISWC/JWS away from Elsevier and Springer right now, and going down the route of JMLR – and prominent conferences like NeurIPS, ICML and ICLR – towards zero-fee OA publishing? I think there are a number of issues to overcome:

  • Inertia: Changing publishers requires considerable organisation, planning, leadership, effort, etc.
  • Publisher prestige: Though difficult to get concrete statistics or details on how broad the issue is, some institutions may still weight (in particular) conference publications with established publishers, like Springer, higher than community-led or purely online publishers, particularly in the absence of more widely-recognised metrics for conferences. Moving away from Springer might have consequences, in terms of the recognised research productivity of ESWC/ISWC authors, disincentivising submissions.
  • Metric reset: Related to the previous point, if JWS moves away from Elsevier, it might require a reset in terms of metrics like impact factor, etc. It might take some years for papers to be picked up by bibliographic indexes. Again, this might lead to journal papers being undervalued by certain institutions for a certain period of time. Something similar might happen for conferences if they split; however, changing publisher should not otherwise affect conference metrics or ratings provided by CORE or Google Scholar in the same way (perhaps Scopus might be an issue).
  • Legacy issues: Relating to the previous point, if the community largely moves away from JWS towards a new journal, it is possible that the journal will continue in new hands, or that it will cease to exist. A drop in metrics and prestige may follow, which, in turn, may reflect poorly on legacy papers published with the journal before the move.
  • Legal: In some cases there may exist active contracts with publishers.
  • Empathy: Employees of Elsevier and Springer are familiar faces at events like ISWC: funding lunches, sponsoring awards, helping the community to negotiate reduced costs or additional OA features, etc. Members of the community have colleagues who work with these companies, collaborate with these companies, have worked previously with them, or still work for them. To be clear, I do not mean to imply any sort of wrong-doing, rather I just mean that many members of our community associate such companies with a friendly face, or multiple friendly faces. At least I do. While this is likely a minor factor, it is still a factor perhaps worth mentioning.
  • Disagreement: Not all members of the community agree that publishing with Elsevier and Springer is a problem. Other members might agree in principle with cheap Gold OA, or Diamond OA, but are put off by the uncertainty, risk, etc., associated with such a move, find the idea to be naïve, or are satisfied with the status quo. Others still might agree that cheap Gold OA or Diamond OA are the right way to go, but are unclear on how to move forward, or disagree on the implementation.

Assuming we can find a suitable implementation that a large part of the community agrees with (the subject of the next sections), most of the other issues seem feasible to overcome. However, the issues of Publisher prestige, Metric reset and Legacy issues, all of which involve how our publications are valued by our respective institutions, may be complicated, and may imply certain costs.

What are the OA alternatives for the Semantic Web?

As mentioned before, a number of research sub-communities within Computer Science (and other areas) already consider Gold or Diamond OA to be the standard, rather than the exception. There are already a variety of prestigious journals and conferences published under Diamond OA or inexpensive Gold OA that have established the feasibility of this route.

Diamond OA Conferences

We’ll start with Diamond OA options and precedents for prestigious conferences (CORE A or CORE A*; ISWC and ESWC are currently CORE A, for comparison; their use as a yardstick of prestige here is not meant to endorse CORE ratings):

ICLR, NeurIPS and ICML are amongst the top conferences in CS according to many metrics; for example, Google Scholar currently ranks ICLR at position 9, NeurIPS at position 10, and ICML at position 19 of all academic publications in terms of h5-index, including conferences and journals from all areas of science (not just CS). All conferences mentioned are indexed on DBLP. One shortcoming I can see is that papers for ICLR, NeurIPS and ICML have not been assigned DOIs on DBLP at least, nor have I found ISBNs for the proceedings themselves, but I think this should not be problematic to resolve: IJCAI, KR and EDBT do have DOIs and ISBNs on DBLP.

One could probably expect more options to come online as institutions push for Diamond Open Access. For example, the Action Plan for Diamond Open Access, published in March 2022, sets out goals to establish standards and best practices for this type of publishing, and has backing from major institutions.

[UPDATE 2022-11-18: Manfred in the comments mentions the possibility of publishing conference proceedings via CEUR-WS, and that this would be possible with a “frame agreement” whereby volunteers in the Semantic Web Community assist in preparing and publishing the proceedings on the platform. A similar process is already in place for the AIxIA Conference. As many Semantic Web workshops are already publishing with CEUR-WS, this could indeed be an interesting route.]

Inexpensive Gold OA Conferences

Other potential options involve inexpensive Gold OA offered by non-profit publishers. While these options include APCs, they tend (with some exceptions) to be a small fraction of those charged by commercial publishers, and could potentially be covered by the community (e.g., by conference sponsors, donors, registration fees, etc.), thus providing Diamond OA:

As ACM is a non-profit organisation, it’s worth mentioning that ACM Press offers some options for Open Access, but APCs under Hybrid OA are still quite expensive, at USD$700–1700 per paper. The ACM ICPS series offers some cheaper proceedings-wide options for Open Access, but these fall into Green OA, enabling links from the conference homepage to an OA version of the paper on the publisher’s website; however, arriving from elsewhere (e.g., via DOI), the paper will be pay-walled.

We must also cover IEEE, which is likewise non-profit. However, looking at some of the Open Access journals it publishes, APCs are in the order of USD$1,850. I did not find options for OA in conference proceedings (like ICDE).

Aside from LIPIcs, Dagstuhl also offers the proceedings series OpenAccess Series in Informatics (OASIcs) for more informal events like schools, workshops and new conferences, and Dagstuhl Artifacts Series (DARTS) for publishing artefacts associated with research that has been peer-reviewed. For ISWC or ESWC, LIPIcs would seem the more appropriate option.

[UPDATE 2022-11-18: On Twitter, Sören Auer mentions also the possibility of publishing with TIB Open Publishing, which supports Diamond OA Conferences and Journals. DOIs and eISSNs are assigned for all publications. There are costs associated with publishing on the platform, in the order of EUR€1,300/year for a journal publishing 15 articles annually.]

Diamond OA Journals

What about journals? A number of prestigious journals are already offering Diamond OA. We spoke previously about JMLR, but it’s not the only journal in this space. More and more journals, particularly in Machine Learning and Theoretical Computer Science, are moving towards a Diamond OA model. Here we discuss some of the models under which they are being published, and highlight a number of prominent examples.

Again, this list is not intended to be complete, but rather to highlight some prominent or otherwise interesting examples of Diamond OA. Other such journals can be found in the Free Journal Network. We exclude Open Research Europe as it is only available to EU-funded researchers.

Non-profit/Inexpensive Gold OA Journals

Aside from Diamond OA journals, I did not find any examples of both non-profit and inexpensive Gold OA journals, i.e., with low (but non-zero) APCs. The lowest APCs I found were SWJ’s, at a cost of USD$550 or EUR€500 per paper to authors: even though IOS Press is a commercial publisher, SWJ’s APCs are considerably lower than those offered by non-profit organisations like ACM and IEEE, where ACM Gold OA for journal papers is $1,300 for members and $1,700 for non-members, while IEEE Gold OA for journal papers is around $1,850 and upwards. Fees can be waived in some cases.

[UPDATE 2022-11-18: On Twitter, Sören Auer mentions also the possibility of publishing with TIB Open Publishing, which supports Diamond OA Conferences and Journals. DOIs and eISSNs are assigned for all publications. There are costs associated with publishing on the platform, in the order of EUR€1,300/year for a journal publishing 15 articles annually.]
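
For a rough sense of scale, the figures quoted in this section can be compared with a few lines of Python. The numbers are those given above; the dictionary labels are just illustrative names, and the TIB figure is a yearly platform cost divided by the quoted 15 articles per year rather than an APC, so it is not directly comparable to the USD amounts.

```python
# Per-paper Gold OA APCs quoted in the text (USD).
# The keys are illustrative labels, not official product names.
apcs_usd = {
    "SWJ (IOS Press)": 550,
    "ACM (members)": 1300,
    "ACM (non-members)": 1700,
    "IEEE (from)": 1850,
}

# Express each APC as a multiple of the lowest one found (SWJ).
baseline = apcs_usd["SWJ (IOS Press)"]
for venue, apc in apcs_usd.items():
    print(f"{venue}: USD {apc:,} ({apc / baseline:.1f}x SWJ)")

# TIB Open Publishing: quoted platform cost in EUR for a journal
# publishing 15 articles per year (a cost to the journal, not an APC).
tib_per_article_eur = 1300 / 15
print(f"TIB Open Publishing: about EUR {tib_per_article_eur:.0f} per article")
```

On these figures, the non-profit APCs come out at roughly two to three and a half times SWJ’s, while the TIB platform cost works out to under EUR€100 per article.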

Outlook for Gold/Diamond OA publishers

There are now a number of highly successful conferences and journals publishing under Diamond OA, particularly in Machine Learning and Theoretical Computer Science. Though not trivial, perhaps it’s time for the Semantic Web community to follow the precedent set in these areas, and to leave behind publishers like Elsevier and Springer. There are a number of options on how we could do this, and choosing between them is not so easy. Particularly promising options in my opinion include LIPIcs for conferences (indeed many conferences have already migrated from Springer to LIPIcs), and Episciences or (if feasible to find sponsors) MIT Press Diamond OA for journals.

The conference situation looks a bit more straightforward as an event like ISWC will not lose its identity and metrics due to a change in publisher; however, it would be important to consider whether or not this might affect how publications would be valued by different institutions.

In the case of the journal, changing publishers may require starting with a new journal from scratch; ideally this could be avoided, but the sooner this is done, the faster the new journal will build a reputation, be indexed, collect metrics, etc. The key aspect for the medium-to-long term is that the community stands behind this new journal and its Diamond OA model, as a complement to SWJ’s relatively cheap Gold OA option.

It is perhaps worth recapping some of the benefits of moving to a Diamond OA publishing model for the Semantic Web:

  • More economical publishing: The obvious benefit is that less money is spent from institutional and research budgets on disseminating and accessing research, creating opportunities to use funds elsewhere. This could also help to make our research community more inclusive by reducing financial barriers.
  • Authors keep copyright and reuse rights: Authors do not sign away the copyright for their works, nor would they be restricted in how they re-use material (e.g., publishing slides, creating extended versions, etc.). Under licences such as CC-BY, others could also reuse material, with citation.
  • Better control of formats: Depending on the final Diamond OA model, the community would no longer be bound by publisher norms regarding publishing, for example, in terms of the template for papers, paper lengths, typesetting used, or the medium (e.g., PDFs vs. webpages with embedded meta-data, video, audio, etc.). The community would have more freedom regarding how, for example, supplementary material such as appendices would be published.
  • Open analytics: With the full-text of papers available under more liberal licences, this would give researchers the ability to download and process full-text papers from the community, which could lead to advances in new metrics, new recommendation systems for related works or potential collaborations, etc. Currently such research is stifled by pay-walls and publisher agreements. As aforementioned, Elsevier has previously taken steps to avoid mining over the publications they host (though they later clarified, somewhat puzzlingly, that it’s okay to do this via their API). Coupled with the previous point, the formats could be adapted to better enable such analytics, and to enhance machine readability, with authors benefiting through increased visibility of their work, and researchers benefiting from more automated methods to find relevant papers, authors, etc. In fact, this could become an interesting area of research for our community, and something in which we could “lead the way”.

Rethinking the conference/journal model

[UPDATE 2022-11-25: Claudio Gutiérrez reached out to mention that he wrote an article some years ago specifically on this topic of conferences vs. journals, which provides some interesting statistics and raises some excellent points not originally covered here. I’ve incorporated some of those points here marked with an *, but I recommend checking out his article for more info. He also provides a lot of references to related reading.]

Aside from changing publishers of our existing events, there may be an opportunity to rethink how we publish research on the Semantic Web in order to create new models that better serve the community, and address existing pain points relating to publishing peer-reviewed research, such as:

  • Under-valued publications: In different institutions, different metrics and heuristics are used to assign different weights or categories to different types of publications. Some institutions (such as my own) may still not assign much value to conference papers in reviews or promotions, for example, as in most areas they are mostly extended abstracts for talks rather than vehicles for disseminating primary research results. Other institutions might put a lot of weight on CORE rankings, acceptance rates, associated publishers or other organisations (IEEE/ACM), submission numbers, etc., for conferences. In the case of journals, metrics like impact factors, or details like publishers, may play a role in determining a publication’s value. For these reasons, some publications may not be assigned the value they merit.
  • Discussion post-publication*: Oftentimes the publication of a paper is only the start of the vetting and discussion process. As the ideas of a paper are adopted, the datasets reused, the code extended, etc., by the broader community, sometimes errors, limitations, or new perspectives may come to light. This is somewhat contrary to the typical conference/journal dogma that the published version is somehow “final”. Applying fixes or updates to conference or journal papers is costly, requiring publishing a letter for more significant errata, or perhaps listing smaller errata on a personal homepage or addressing them in a more informal way (e.g., updating a preprint). Likewise, there is often no systematic way to collect and organise post-publication feedback from the community with respect to published papers.

Other pain points relate specifically to conferences or journals.

Conference pain-points

For conferences, there are specific pain points, including:

  • Annual deadlines: Our key conferences have annual deadlines, and missing these deadlines, or having a paper rejected, implies having to wait another year to try again. In some cases we might have several relevant papers for a conference that are stressful to prepare for a single date. The same can also apply for reviewers, who have to review a batch of papers within a relatively short time-frame that could perhaps be split over the year.
  • No revisions: At conferences, there may be promising papers that, with some revisions, would make great contributions, but without these revisions, are deemed not ready for publication (as revisions are not typically supported in the conference model). The yes/no style of conference reviewing is costly for authors, requiring them to submit elsewhere, potentially leading to rejection for an entirely different set of (possibly subjective) reasons, creating a moving target.
  • Lack of review continuity: Relating to the previous point, but focusing on the reviewer side, the majority of papers rejected from conferences will be resubmitted elsewhere, and likely reviewed from scratch by a fresh set of reviewers. If resubmitted multiple times, this can imply certain global inefficiencies when compared with a journal model where major revisions permit authors to revise their work and prepare a cover letter of changes, and where the same reviewers are invited again and can focus on those changes in the next submission. While major revisions directly imply more work for reviewers, who must check the revised version as well, overall it should (arguably?) create global efficiencies by having more continuity in the review process. Also it avoids the risk of the fresh set of reviewers missing key limitations raised previously and left unaddressed by authors (when I was co-chair of ISWC 2022’s Research Track, reviewers of two papers pointed to plagiarism concerns raised in the review process of the same papers in other conferences in which they were involved, which indeed were valid concerns).
  • High selectivity: Prestigious conferences are often associated with low acceptance rates, often below 20%, and sometimes (much) lower. Acceptance rates are, in turn, used informally, in some settings, as a measure of conference quality. Programme Committees often implicitly understand this selectivity, and may reject a paper not for explicit technical faults, nor for a lack of novelty, but rather for not meeting the “bar” of the conference. Accepting certain papers, in their mind, may lower this “bar”, and thus demerit the conference (and potentially the papers that they themselves have published there), or otherwise open the floodgates for “trivial papers” to be published. The issue is that this “bar” is completely subjective, and may lead to otherwise useful research papers being rejected.
  • Tight page limits: A common phrase seen in conference papers is something along the lines of “omitted for lack of space”. While tight page limits can help to reduce reviewer workload by making authors focus on what matters, it can also sometimes increase it by reducing the quality of the paper, the depth of explanatory material (e.g., examples) provided, etc. There is certainly a debate to be had about what reasonable page limits are, but currently this is also tied to factors relating to the publisher. Some publishers, for example, charge for OA by the page. Currently, in order to present work at a conference, it is largely necessary to conform to page limits that might be restrictive for a given body of research. Page limits for ESWC and ISWC are particularly tight when compared to other conferences due to the compact Springer/LNCS format.
  • Undervaluation outside CS: Though many CS departments have won the battle to have conference papers be assigned due consideration at the Faculty or University level alongside, or even above, journal papers, in other institutions, conference papers are not considered as having much value, based on the fact that in most areas, conference papers are extended abstracts for talks rather than ways of conveying primary research. In these latter institutions, conference papers might be gravely undervalued with respect to the work they imply, the results they convey, and the impact they generate.
  • Travel: Traditionally there is an expectation that at least one author will travel to the conference to present the work. While in-person conferences have some major benefits, travel can be expensive in terms of time and money. Most conferences tend to be held in the Northern Hemisphere, meaning that such costs are aggravated for the Global South. Added to this are concerns about the impact on climate generated by flights, questions of child-care, getting visas, etc., which may affect different demographics of researchers in different ways. (This of course relates to the question of in-person vs. hybrid conferences, in future.)

It’s quickly worth noting a pain point with workshops:

  • Indexing of proceedings: Many workshops in the Semantic Web area publish their proceedings with CEUR (Diamond OA), but these proceedings require a minimum number of papers and a minimum number of pages per paper, which smaller workshops do not satisfy, resulting in no proceedings, or (sometimes messy) joint proceedings. Also, CEUR papers are not assigned DOIs, are only partially indexed even on DBLP, etc.

Journal pain-points

For journals, there are other pain points, including:

  • Unpredictable timelines: The times taken to receive the initial reviews in journals, or to reach final decisions, or for the paper to be published, can be highly unpredictable. Getting a journal paper published, in some cases, may take years, an issue that is exacerbated when a traditional publisher must physically print the issue. This creates considerable friction with the fast-moving nature of research in our discipline.
  • No presentation: Unlike conferences, papers published in journals are not presented to the community in an event like a conference, and thus it can be more difficult to seek feedback from or more generally discuss or advertise results within the broader community.
  • Poor reviewer incentives: Reviewers for journals are often selected in an ad hoc manner. While in the case of conferences (most) reviewers will at least be recognised as PC members (which, though not much, is something), in the case of journals, oftentimes only the editor will know of the work conducted by the reviewer. As a result, reviewing for journals is particularly thankless work, and, in turn, it can be difficult for editors to find reviewers for papers, creating delays for journals. (For this reason perhaps, JWS includes regular reviewers as part of its Editorial Board.)
  • Copyright issues: Presenting work at a conference has several benefits, as per the previous two points. Thus a common strategy is to present a preliminary work at a conference first, and then present an extended definitive version of the work as a journal paper. Most journals accept this practice as legitimate, and a valuable way of presenting follow-up results that would not fit in the tight confines of a conference paper. However, if the conference paper is published with Springer and/or the journal paper with Elsevier under a traditional model where copyright is passed to the publisher and/or they are given exclusive publishing rights, formally speaking at least, this prohibits the authors from re-using conference material for the extended version, even with appropriate citation and explanation of novelty, creating a “grey practice”, or partially-enforced requirement, to rewrite content so that it’s not the same between both versions, even if the additional research value of such a rewrite is questionable, and even if the journal version clearly extends the conference paper with significant results. (A similar issue can happen between workshop or arXiv’ed versions of works; though admittedly, to the best of my knowledge, publishers do not tend to pursue such cases, this is really dependent on them.)
  • Static Editorial Boards*: Compared to conferences, which will typically have a high churn of new chairs and new (S)PC members each year, bringing with them new ideas and perspectives, journals often tend to have a more static Editorial Board and Editors-in-Chief. Journals can thus lack injections of new ideas or perspectives, can lack representation of members coming from new sub-communities or people expert in newer topics, etc. This can somewhat limit innovation in the case of journals (and arguably conferences too, though I would think less so).

Hybrid models

The traditional dichotomy between publishing conference papers and journal papers creates some difficult choices. Some particular pain points are addressed by initiatives such as having a journal track at key conferences, or having hybrid conferences where people can present their work online, or giving (top) reviewers awards or other titles. But researchers are increasingly starting to question whether this conference vs. journal situation is really productive for Computer Science. And more and more communities are starting to look at models that blend conferences and journals in ways that capture more of the benefits (and fewer of the shortcomings) of both.

An interesting example is that of the International Conference on Very Large Data Bases (VLDB), which is a top databases conference, and for some years has published its proceedings as Diamond OA journal issues (under the Proceedings of the VLDB Endowment), organising monthly deadlines, having a Programme Committee of reviewers, page limits, a one-shot major revision process, and an annual conference for presentations. In other words, the community has picked elements of conferences and journals that they feel work well for their members, creating a hybrid model that remains popular among researchers. (P)VLDB is both ranked CORE A* and has a JCR journal impact factor of 3.557.

A similar example is that of the Transactions of the Association for Computational Linguistics (TACL), where authors submit conference-length papers that are published as journal articles, with submission deadlines every month, and with the option to present accepted papers at a range of ACL-affiliated conferences (it is not included in CORE, but it recently gained its first impact factor of 9.194).

Though now just starting, Transactions on Machine Learning Research also goes in a similar direction in terms of accepting conference-length papers that are published as a journal, also reducing emphasis on impact, and rather focusing on technical correctness (though no conference presentations are offered yet).

Aside from publishing conference proceedings as journals, a number of top conferences in CS have introduced features that partially head in the direction of journals, including revisions, multiple review cycles, and so forth. These include SIGMOD, PODS, etc. Conferences such as AAAI have set up fast-track submissions of rejected papers from highly-competitive conferences, such as NeurIPS.

In summary, while the Semantic Web has stuck quite closely to a traditional conference/journal model, there are more hybrid precedents in other communities that might serve as a model going forward.

A concrete proposal

I will conclude with a concrete proposal on how publishing research about the Semantic Web could move forward, addressing not only the question of publishers and OA, but also some of the aforementioned pain points. The proposal is based on the principle of offering the community novel options to publish their research in an open way with no direct fees to authors or readers. It is meant as food for thought, and not all questions will be answered. Details such as names, durations, publishers, etc., are not essential to the proposal.

The proposal involves three different types of publications within the Semantic Web community that are not necessarily dependent on each other. The third publication type is somewhat speculative, and not an essential part of the proposal (or could be postponed for future consideration).

Proceedings of the Semantic Web (PSW)

Proceedings of the Semantic Web (PSW) is intended to address the issue of conference publishing, in particular, for ISWC. I will refer to two options, both of which I feel provide a significant step forward, but where the second is the one I would argue best fits the needs of the community.

Option 1: Diamond OA Conference Proceedings

This option involves moving the proceedings from Springer to a Diamond OA conference proceedings. In this space, LIPIcs (Dagstuhl proceedings) is a natural alternative for a formal proceedings (with indexing, DOIs, existing prestigious conferences, etc.), where the EUR€60/paper fee could be covered through sponsorship or registration fees (ISWC currently publishes around 50 Main Track papers in the Springer proceedings each year, where EUR€3,000 would be sufficient to support Diamond OA for the full ISWC proceedings for one year). As aforementioned, many conferences have recently moved from Springer to LIPIcs, and this would keep the ISWC process more or less as it is while shifting towards a Diamond OA model.
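
To make the arithmetic behind this estimate explicit, here is a minimal sketch in Python. The fee and paper count are the figures quoted above; the attendance figure and the function names are hypothetical, chosen only to illustrate how the cost might translate into a registration surcharge.

```python
# Figures taken from the text: LIPIcs fee per paper (EUR) and the
# approximate number of ISWC Main Track papers per year.
FEE_PER_PAPER_EUR = 60
PAPERS_PER_YEAR = 50

# HYPOTHETICAL attendance figure, for illustration only (not from the text).
ASSUMED_ATTENDEES = 500


def proceedings_cost(papers: int, fee_per_paper: float) -> float:
    """Total annual cost of publishing `papers` papers at a flat per-paper fee."""
    return papers * fee_per_paper


def surcharge_per_attendee(total_cost: float, attendees: int) -> float:
    """Registration surcharge needed to spread `total_cost` over `attendees`."""
    if attendees <= 0:
        raise ValueError("attendees must be positive")
    return total_cost / attendees


total = proceedings_cost(PAPERS_PER_YEAR, FEE_PER_PAPER_EUR)
per_head = surcharge_per_attendee(total, ASSUMED_ATTENDEES)
print(f"Total: EUR {total:,.0f}; per attendee: EUR {per_head:.2f}")
```

Under these assumptions, the full proceedings cost EUR€3,000, or EUR€6 per attendee; plugging in actual paper counts and attendance would give a more realistic figure.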

The main downside is that conference papers are already undervalued even when published through a reputable company like Springer, and moving to a lesser known publisher may further aggravate this issue of how papers are valued by the individual institutions of researchers in our worldwide community.

As PC Co-Chair of the Research Track this year, together with Uli Sattler and Claudia d’Amato, we pushed to consider Option 1 for ISWC 2022, but there were concerns about loss of prestige and contracts; the conclusion was “not this year”. However, in writing this post, and in particular considering the major issue of how conference publications (particularly with non-traditional publishers) are undervalued in many settings, I am now starting to rather favour another option, as follows.

Option 2: Diamond OA Journal Proceedings

Option 2 is inspired by the model of the Proceedings of the VLDB Endowment (PVLDB), which is a Diamond OA journal that publishes the proceedings of the VLDB conference. In this proposal, PSW (like PVLDB) accepts conference-length submissions associated with ISWC that are published as issues of a journal. Effectively, PSW becomes the proceedings of ISWC, where accepted papers are presented at ISWC.

For researchers working in institutes where conference publications are worth less, or worthless, this avoids undervaluing their work, or having to go down the route of preparing extended journal papers just so their work gets adequately recognised by their institute or research funders. It may also help to mitigate the costs of moving to a Diamond OA publisher; unlike for conferences, in the case of journals, publisher-agnostic metrics such as impact factor (though problematic) are widely established. Even though impact factors and indexing take some years, in the meantime, PSW publications will still be considered the proceedings of ISWC, and benefit from its CORE A standing (as happens for PVLDB), etc.

This model would also lend itself well to the following (optional) features, as seen in PVLDB, that might ease some of the aforementioned pain points associated with conferences, and may merit further discussion in future (though it may make sense to postpone this and initially keep PSW reviewing cycles akin to the current ISWC process):

  • PSW could have more regular submission deadlines (e.g., monthly, bimonthly, trimonthly, or biannually).
  • PSW could have Chairs and a Programme Committee that are invited each year by an Advisory Board.
  • PSW could facilitate submissions of revised versions of papers, perhaps under a single-strike major revision process.

In terms of Diamond OA publishing, two interesting options would be MIT Press, or an Overlay Journal with Episciences. Being an established publishing house, MIT Press would lend the journal extra prestige, and would offer print copies, but given the Gold OA costs, I suspect that quite a lot of money would need to be paid via sponsorship to avoid APCs for Diamond OA. On the other hand, Episciences should be a much cheaper option, and would allow greater control of formats, etc., but currently would lack printed versions, and currently lacks the vicarious prestige of an established publisher; however, as mentioned previously, the new journal(s) would be in good company with Episciences, considering the other excellent journals already associated with the organisation.

Transactions on the Semantic Web (TSW)

TSW would be a Diamond OA journal that implements a traditional journal model (similar to JWS), accepting longer-form papers, including research papers, survey papers, and potentially more (e.g., resource papers).

As suggestions for an optional structure and features:

  • The journal has an Editorial Board with senior members of the community, as well as a Review Board comprised of reviewers who are expected to provide around 2 reviews per year (thus reviewers gain credit through being part of the RB, and will hopefully be more “dependable”, reducing review turnaround times; JWS already implements a similar model). The EB and RB will span multiple years but it is expected that there will be regular changes.
  • Authors may submit summaries of accepted papers for consideration for presentation at conferences or other events (e.g., workshops) assuming no similar presentation has been given before.
  • Extended versions of accepted PSW papers are invited, and the same reviewers will be invited to review the paper (and maybe be invited to the RB, if not already present).

In terms of publishing, one idea would be to instead implement TSW as a “journal track” of PSW under Option 2, i.e., have one journal with a track for ISWC papers, and another open track that accepts traditional journal papers. However, this would likely affect whether PSW can be dually considered as both a conference and a journal proceedings (like PVLDB). Hence it might be better to define TSW as a separate (but related) journal publishing under a Diamond OA model, again with MIT Press, or with Episciences. Again, the publisher should not matter so much once papers are indexed, an impact factor is established, etc.

If possible, in order to avoid a “metric reset” and “legacy issues” (see above), it would be great to pivot JWS towards this model in order to inherit impact factors, indexing, etc., as a base, and also to “grandfather” existing JWS publications as part of the new journal. It is not clear that this is feasible, but I think that we should, in any case, implement a Diamond OA journal for the Semantic Web along the lines proposed here for TSW.

Letters on the Semantic Web (LSW)

Though not an essential part of the proposal, and one that could potentially be postponed for discussion later, the idea of LSW would be to act as a conference proceedings or journal that follows the model of TMLR. This venue will deemphasise aspects such as potential impact and technical depth, rather putting emphasis on aspects such as novelty and technical correctness. The idea here is to keep PSW and TSW as highly selective venues with excellent metrics, but at the same time address the pain points associated with this high selectivity, giving the option of Diamond OA publishing for papers that are deemed technically sound and novel, but not necessarily meeting the “bar” for PSW and TSW.

Submissions to TSW or PSW that are rather deemed to better meet the criteria of LSW could be fast-tracked for publication in LSW. Authors may optionally submit summaries of accepted papers in order to apply for presentation at conferences or other events (e.g., workshops) assuming no similar presentation has been given before. LSW could also potentially publish new types of papers not well-covered elsewhere, such as position papers, resource papers, tutorial papers, reviews, etc., perhaps similar to something like CACM or SIGMOD Record.

(Another name might be Communications on the Semantic Web (CSW).)

[UPDATE 2022-11-18: Axel mentions in the comments that, although published with Springer, the BPM conference implements a similar 2-tier model, where papers can be accepted for the main LNCS proceedings, or a secondary BPM Forum proceedings published by LNBIP. See example from 2022.]

Summary of proposal and outlook

The PSW/TSW/LSW proposal outlined here is intended as food for thought, and fodder for discussion and critique. There are a lot of issues left unaddressed here, such as the relation to our existing conferences/journals/organisations, the question of how to avoid a metric reset (if possible), details like reviewer loads, rebuttal or no, page limits, whether or not we should try to reach out to other communities (e.g., via knowledge graphs), what papers other than research papers to accept (if any), etc., etc.

In terms of concrete next steps:

  • Investigate options and opinions regarding PSW (for ISWC):
    • Check with SWSA existing constraints with respect to Springer and ISWC proceedings.
    • Explore and compare in more depth Option 1 (Diamond OA conference proceedings) and Option 2 (Diamond OA journal proceedings), as well as concrete publisher options.
    • Upon choosing an option and publisher, figure out details such as naming, options for more regular submissions, revision process, etc. Possibly start with annual calls and no revisions, per ISWC currently, and rather implement more gradual changes over the years (e.g., perhaps towards more regular calls, revision processes, etc.).
  • Investigate options and opinions regarding TSW (for JWS):
    • Check to see under what conditions, if any, JWS could continue while changing publisher (keeping indexing, impact factor, etc.). Decide on continuing JWS (if possible) or moving to a new journal.
    • Explore and compare in more depth Diamond OA options, such as MIT Press and Episciences. Understand in more detail the financial/sponsorship requirements for Diamond OA under MIT Press; if a potential option, perhaps coordinate with the Editors-in-Chief of Data Intelligence to seek possible synergies or feedback. Otherwise understand in more detail the processes underlying Episciences.
    • Consult with existing JWS boards (EiCs, ACs, EBMs) in light of a concrete proposal in order to understand opinions regarding making a switch, the future of JWS, etc. If a critical mass of consensus exists, change publisher or start the new journal, and implement the switch.
  • Investigate options and opinions regarding LSW. This can potentially be postponed for now, and returned to in future.

Perhaps we might also want to consider how ISWC/PSW could relate to ACM, and to organisations such as SIGWEB. There might be possible synergies with the World Wide Web Conference (WWW/TheWebConf).

It would also be great in any new initiative to seek opportunities to commemorate Aaron Swartz, his contributions to the community, and the values he fought for, but only with permission, with caution, and in a way that preserves the principles he espoused.

Feel free to drop some comments below.

Acknowledgments

Sitting down to create a blog, organise my thoughts, and write this blog post was very much motivated by a recent JWS meeting and conversations surrounding the Townhall at ISWC 2022. I just wanted to acknowledge a number of people like Ian Horrocks, Uli Sattler, Claudia d’Amato, Axel Polleres, C. Maria Keet, and more besides, discussions with whom influenced and motivated this post. In some cases I do not connect specific people to specific points (in an overabundance of precaution) as they came up in private or otherwise closed conversations.

8 responses to “Publishing research about the Semantic Web”

  1. Hi!

    There is a small bullet point for CEUR-WS in the blog.

    I would like to mention that the minimum requirements to publish a volume at CEUR-WS are really easy to meet. A workshop that fails to pass the minimum requirements should in my view not take place or rather have no proceedings published at all.

    The semantic web community has been publishing tons of volumes with CEUR-WS. We are happy to serve this community further. If the community wants to publish conference proceedings with us (e.g. to evade paywalls), then this is in principle possible. A frame agreement would be possible to give the semantic web community more peace of mind that those conference proceedings (possibly with companion volumes for the workshops affiliated to the conference) are getting published at CEUR-WS.

    DBLP indexes most volumes of CEUR-WS. They have a certain backlog. Sometimes the meta data is corrupted, I heard. This cannot easily be solved by us due to lack of person power. However, I believe that practically all volumes coming from workshops affiliated to known conferences are getting indexed.

    We welcome contributions from the semantic web community to ensure the quality of the meta data and streamline the indexing of volumes by DBLP.

    Kind greetings, Manfred (founder of CEUR-WS and chair of the CEUR-WS advisory team)

    • Hi Manfred, first of all thanks for the great service with CEUR-WS! Regarding CEUR-WS being a bullet-point in this blog post, I think this can perhaps be seen as a positive. The focus here is mostly on the issue for conferences and journals, whereas CEUR-WS is explicitly for workshop proceedings, and hence did not appear to me to be an option for the likes of ISWC or JWS. The situation for workshop proceedings, with CEUR-WS, is already pretty good, hence why it was not a focus here. However, if you say that in future, a similar option could be made available for conferences, that sounds interesting! I will edit the blog post to mention this.

      In terms of my own wish-list for CEUR-WS, it would be great to see more complete coverage in DBLP. You mention that almost all CEUR-WS proceedings are indexed in DBLP, but if I look at CEUR-WS Volumes 2900-2999, for example, I found 73/100 of the volumes indexed (https://dblp.org/db/series/ceurws/ceurws2900-2999.html). For volumes 2800-2899, 61/100 are indexed (https://dblp.org/db/series/ceurws/ceurws2800-2899.html). But perhaps, as you say, proceedings for workshops in well-known conferences are well covered?

      Regarding “A workshop that fails to pass the minimum requirements should in my view not take place or rather have no proceedings published at all.”, understood! However, this can sometimes end up being a bit harsh for authors who have a paper accepted at a workshop that received few submissions (a case common at conferences like ISWC, where workshops are often used to foster new sub-communities). In this case, the authors’ paper will not be published, through no fault of their own. I think the solution applied at ISWC and ESWC has been to join these smaller workshops into one proceedings, which is a reasonable solution, but can get a bit messy.

      Best, and thanks again for your work with CEUR-WS!
      Aidan

      • Hi Aidan,

        I did not follow the coverage of CEUR-WS in DBLP recently. The situation may be a bit worse than I thought. DBLP has the goal to cover all of CEUR-WS but if the data import fails (e.g. a malformed index.html file), then they may skip it and fail to revisit it.

        In the end it is a question of available person power. We at CEUR-WS do not have the time to keep track of DBLP. We ask the volume editors to take care of this themselves.

        Yes, we would publish conference proceedings. A frame agreement would be useful to do so, i.e. some dedicated persons from the semantic web community take charge of a sub-series “Semantic Web Proceedings” at CEUR-WS and process the submissions. We already have such sub-series for AIxIA. The dedicated persons would become CEUR-WS team members and would be responsible for executing the publication process. Takes about 15-30 minutes per volume, if there are no major errors in a submission.

        Workshops affiliated to the same conference could be aggregated in a companion volume rather than individual small volumes. This would solve the problem of authors in workshops that do not attract many papers. Still, I believe that workshops with too few papers should be cancelled, because there will not be enough discussion.

        Kind greetings, Manfred

  2. Dear Aidan,

    Congrats, great piece which I really enjoyed reading!

    Two small remarks:

    1)
    FWIW, I would add one more point in “Journal pain-points”:
    the strict deadlines in conference reviewing and the (at least perceived) less strict deadlines in journal reviews, with community journals being slower overall in CS compared to other communities. I have to admit that I only have anecdotal proof for this; it was mentioned, I believe, in the “FLoC Panel: Publication Models in Computing Research: Is a Change Needed? Are We Ready for a Change?” (https://login.easychair.org/smart-program/VSL2014/ICLP-2014-07-17.html) on the issue at Vienna’s “summer of logic” in 2014.

    I think this is an important factor across conferences and journals in a conference-culture heavy community like ours… could be added under “Poor reviewer incentives:”.

    2) While they’re still exclusively publishing with Springer, I found the 2-tier publishing model of the BPM conference very interesting, where papers can be accepted as full conference papers (in Springer’s LNCS) or in the “BPM Forum”, see https://dblp.org/db/conf/bpm/index.html, for less mature papers (published in Springer’s less prestigious(?) “Lecture Notes in Business Information Processing” series).
    This allows them to accept more papers, and would be combinable IMHO with a journal model for the top tier papers.
    I find this model interesting to consider when re-thinking a mixed journal/conference model and it would be interesting to explore routes to combine with the VLDB model, or in other words would lend itself to a combination of Option1+Option2 as one.

    just my two cents,

    Axel

    • Hey Axel, thanks!

      Regarding point 1, perhaps it’s covered by “Unpredictable timelines”? I.e., this is the consequence that is painful? I think maybe point 1, if I understood, is another underlying reason for unpredictable timelines.

      Regarding point 2, thanks! Even though it’s still a Springer proceedings, I think mentioning the two-tier BPM model in the latter part of the discussion is interesting. It kind of ties in with the more speculative “LSW” suggestion.

      Best,
      Aidan

  3. Hi Aidan,

    to complement a bit your thinking about APCs, there is a lot of work done by the library of Bielefeld U. about it [1].
    The data are browsable by institution, journal, year and offer a nice global picture. It is AFAIK not available in RDF and a bit more info about provenance would make it perfect (it seems however, that my institution communicates this information to them).

    [1] https://treemaps.intact-project.org/apcdata/openapc/#institution/

    • Hey Jérôme, many thanks! That’s a very interesting website. I added an update to the post to point to the statistics provided there. (APCs are now big business it seems.)
