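This is an announcement of a new mailing list, and a proposal for three things:
  1. nested namespaces for modules;
  2. a standardised namespace layout, with a "Std." prefix for genuinely standard libraries;
  3. a process by which candidate libraries can enter the "Std." namespace.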
The new mailing list is for the discussion of these proposals. Please subscribe if you are interested. Follow-ups set accordingly.
The purpose for this new list is to:
To subscribe: http://haskell.org/mailman/listinfo/libraries/
Everyone agrees that Haskell needs good, useful, libraries: lots of them, well-specified, well-implemented, well-documented. A problem is that the current "Standard Libraries" defined by the Haskell'98 Report number only about a dozen. But there are actually many more libraries out there: some are in GHC's hslibs collection, others are linked from haskell.org, even more are used only by their original author and have no public distribution.
What is more, there is no Haskell Committee. There is no-one to decide which candidate libraries are worthy to be added to the "Standard" set. This stifles the possible distribution of great libraries, because no-one knows how to get my library "accepted".
Furthermore, the existing libraries that people distribute from their own websites often run into problems when used alongside other people's libraries. A library usually consists of several modules, but often the constituent modules have simple names that can easily clash with modules from another library package. This leads people to ad hoc solutions such as prefixing all their modules with a cryptic identifier e.g.
HsParse
XmlParse
HOGLParse
THIHParse
Just counting the libraries currently available from GHC's hslibs, and haskell.org's links, there are currently over 200 separate modules in semi-"standard" use. As more libraries are written, the possibility of clashes can only increase.
Related to this problem, although not identical, is the difficulty of finding a library that provides exactly the functionality you need to help you write a specific application program. How do you go about searching through 200+ modules for interesting-looking datatypes and signatures, starting only from the module names?
My view is that many of these problems are rooted in Haskell's restriction to a flat module namespace. If we can address that issue adequately, then I believe that many of the difficulties surrounding the provision of good libraries for Haskell will simply fall away.
Introduce nested namespaces for modules. The key concept here is to map the module namespace into a hierarchical directory-like structure. I propose using the dot as a separator, analogous to Java's usage for namespaces.
So for instance, the four example module names above using cryptic prefixes could perhaps be more clearly named
Haskell.Language.Parse
Text.Xml.Parse
Graphics.Drawing.HOpenGL.ConfigFile.Parse
TypeSystem.Parse
Naming proceeds from the most general category on the left, through more specific subdivisions towards the right.
For most compilers and interpreters, this extended module namespace maps directly to a directory/file structure in which the modules are stored. Storing unrelated modules in separate directories (and related modules in the same directory) is a useful and common practice when engineering large systems.
(But note that, just as Haskell'98 does not insist that modules live in files of the same name, this proposal does not insist on it either. However, we expect most tools to use the close correspondence to their advantage.)
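The expected correspondence between module names and files can be sketched as a small function. This is only an illustration: the names splitOnDots and modulePath are hypothetical, and both the "/" separator and the ".hs" suffix are assumptions rather than anything the proposal requires.

```haskell
import Data.List (intercalate)

-- Hypothetical helper: split a module name on dots,
-- e.g. "Text.Xml.Parse" becomes ["Text", "Xml", "Parse"].
splitOnDots :: String -> [String]
splitOnDots s = case break (== '.') s of
  (component, [])       -> [component]
  (component, _ : rest) -> component : splitOnDots rest

-- Hypothetical helper: map a dotted module name onto a relative source
-- path. The "/" separator and the ".hs" suffix are assumptions.
modulePath :: String -> FilePath
modulePath name = intercalate "/" (splitOnDots name) ++ ".hs"
```

A tool that follows this convention would look for the module Text.Xml.Parse in the file Text/Xml/Parse.hs beneath some root directory, while a flat name such as Pretty still maps to Pretty.hs.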
There are several issues arising from the particular proposal here.
modid -> qconid
where currently the syntax is
modid -> conid
import qualified XmlParse
... XmlParse.element f ...
becomes
import qualified Text.Xml.Parse
... Text.Xml.Parse.element f ...
However, I propose that every import have an implicit "as" clause to use as an abbreviation, so in
import qualified Text.Xml.Parse [ as Parse ]
the clause "as Parse" would be implicit, unless overridden by the programmer with her own "as" clause. The implicit "as" clause always uses the final subdivision of the module name. So for instance, either the fully-qualified or abbreviated-qualified names
Text.Xml.Parse.element
Parse.element
would be accepted and have the same referent, but a partial qualification like
Xml.Parse.element
would not be accepted.
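The rule for the implicit "as" clause can be modelled in a few lines. This is a minimal sketch of the rule as stated above, not code from any compiler; the names implicitAlias, acceptedQualifiers and accepts are invented for illustration.

```haskell
-- The implicit alias is the final dot-separated component of the
-- module name, e.g. "Parse" for "Text.Xml.Parse".
implicitAlias :: String -> String
implicitAlias name = case break (== '.') name of
  (component, [])  -> component
  (_, _ : rest)    -> implicitAlias rest

-- Qualifiers accepted when the programmer writes no "as" clause of her
-- own: the full module name and the implicit alias, and nothing else.
acceptedQualifiers :: String -> [String]
acceptedQualifiers name
  | alias == name = [name]
  | otherwise     = [name, alias]
  where
    alias = implicitAlias name

-- Does the given qualifier refer to the given imported module?
accepts :: String -> String -> Bool
accepts moduleName qualifier = qualifier `elem` acceptedQualifiers moduleName
```

Under this model, accepts "Text.Xml.Parse" "Parse" holds, while the partial qualification "Xml.Parse" is rejected.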
A.B.C.D
in Haskell'98 means the composition of constructor D from module C, with constructor B from module A:
(.) A.B C.D
No-one so far thinks this is any great loss, and if you really want to say the latter, you still can by simply inserting spaces:
A.B . C.D
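As a concrete instance of the spaced form, the sketch below composes two qualified constructors. The aliases A and C are chosen here only to mirror the example; Data.Maybe and Data.Either stand in for the modules A and C.

```haskell
import qualified Data.Maybe  as A  -- alias A stands in for module A
import qualified Data.Either as C  -- alias C stands in for module C

-- With spaces around the dot, this is ordinary function composition of
-- the qualified constructors A.Just and C.Left. Written without the
-- spaces, A.Just.C.Left would instead be read as one qualified name.
wrap :: a -> Maybe (Either a b)
wrap = A.Just . C.Left
```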
Further down this document, I give more motivation and a rationale for this proposal of nested namespaces. But first, two other proposals which rest on the first one.
Adopt a standardised namespace layout to help those looking for or writing libraries, and a "Std." namespace prefix for genuinely standard libraries. (These are two different things.)
The hslibs collection of modules is a great starting place for finding common libraries that could become standards. I propose that we adopt a "standardised" namespace hierarchy, based on the current hslibs layout, into which Haskell programmers can plug their own libraries relatively easily (whether they intend to release them or not). The aim is to make it clear where to place a new module, and where to search for a possible existing module.
For instance, in ASCII art, here is a small part of a suggested tree.
+ Data + Structures + Trees + AVL
|      |            |       + RedBlack
|      |            |
|      |            + Queue + Bankers
|      |                    + FIFO
|      + Encoding + Binary
|                 + MD5
|
+ Graphics + UI + Gtk + Widget
|          |    |     + Pane
|          |    |     + Text
|          |    |
|          |    + FranTk
|          |
|          + Drawing + HOpenGL + ....
|          |         + Vector
|          |
|          + Format + Jpeg
|                   + PPM
+ Haskell + ....
|
A fuller proposed layout appears on the web at
http://www.cs.york.ac.uk/fp/libraries/layout.html.
Simon Marlow suggests an alternative layout at
http://www.cs.york.ac.uk/fp/libraries/layoutSM.html
In addition to a standardised hierarchy layout, I propose a truly Standard-with-a-capital-S namespace. A separate discussion is needed on what exactly would constitute "Standard" quality, but by analogy with Java where everything beginning "java." is sanctioned by Sun, I propose that every module name beginning "Std." is in some sense sanctioned by the whole Haskell community.
So for instance, an experimental, or not-quite-complete, library could be called
Text.Xml
but only a guaranteed-to-be-stable, complete, library could be called
Std.Text.Xml
The implication of the Std. namespace is that all such "standard" libraries will be distributed with all Haskell systems. In other words, you can rely on a standard library always being there, and always having the same interface on all systems.
Develop a process by which candidate libraries can be proposed to enter the Std. namespace.
Since Haskell'98 is fixed, and there is no longer a Haskell Committee, there is no official body capable of deciding new standards for libraries. However, we do have a Haskell community which will use or not use libraries, depending on their quality. So libraries will become standards by a de-facto process, rather than de-jure.
Apart from the Haskell compiler implementers, we wanted a means to encourage the whole community to be involved in recognising de facto "standard" libraries. The mailing list 'libraries@haskell.org' is one contribution. We hope this will work on the same model as the FFI mailing list, which has been pretty successful at allowing a community of designers and implementers to explore their FFI needs and solidify a design that is common across at least three Haskell systems.
On top of this discussion, however, some final decisions will have to be made on which libraries achieve entry to the "Std." namespace. The Haskell implementers have collectively proposed a ruling troika, one representing each of the three main Haskell systems (Hugs, ghc, nhc98). These are Simon Marlow, representing ghc, and current keeper of the hslibs collection; Malcolm Wallace, representing nhc98; and Andy Gill, representing Hugs users.
Some obvious criteria for entry to the "Std." namespace would be:
These suggested criteria need some discussion and improvement.
After the initial period of deciding what belongs in the "Std." namespace, I would expect any further candidate libraries that are proposed for standardisation to spend some time in another part of the namespace hierarchy whilst they gain stability and common acceptance, before being moved to "Std.".
Imagine you have just written a new library of, say, pretty-printing combinators. You want to release it to the Haskell public. So what do you call it?
module Pretty -- already taken (several times)
module UU_Pretty -- also taken
module PrettyLib -- already exists as well
Ok, so lacking any further inspiration, you end up deciding to call it
module MyPretty -- !
Surely there must be a better solution. Of course there is - namespaces. Let's classify libraries that do similar jobs together:
module Text.PrettyPrinter.Hughes -- the original Hughes design
module Text.PrettyPrinter.HughesPJ -- later modified by Simon PJ
module Text.PrettyPrinter.UU -- the Utrecht design
module Text.PrettyPrinter.Chitil -- Olaf's new design
These are exactly the same Pretty libs as before, but named more sensibly. It is still clear that each is a pretty-printing library, but it is also clear that they are different.
Incidentally, have you ever tried to write your own module called Pretty? You may have discovered with GHC (which has a Pretty already in the hslibs collection), that you get strange errors. This is because sometimes the compiler can be confused into reading one Pretty.hi interface file (i.e. yours), yet linking the other Pretty.o object file (i.e. from hslibs), ending in a core dump. With proper module namespaces, this confusion should never happen again.
You are writing a complex library that has a couple of layers of abstraction. For some users, you want to expose just a small high-level set of types and functions. Other users will need more detailed access to lower-level stuff.
With namespaces, you can use the directory-like structure to make these kinds of access explicit. For instance, imagine a socket library:
module Network.Socket
It exports an abstract type Socket for ordinary users - they only need to know its name. More advanced hackers however can play with the details of the type, because you also have:
module Network.Socket.Types
which exports the Socket type non-abstractly i.e. Socket(..). And of course this abstraction is easy for the library-writer to manage, because the implementation of the more abstract layer simply imports and re-exports a careful selection of the more detailed layers.
Don't forget that, in terms of the actual filesystem layout, it is perfectly OK to have e.g.
file Network/Socket.hs
dir Network/Socket
file Network/Socket/Types.hs
You are managing a software engineering project. Several people are working more-or-less independently on different sections of the program. To avoid mistakes with files, you give each one a separate directory to place their code in. But in Haskell'98 this is not enough to ensure that they invent module names that do not clash with other developers' modules. So you insist that everyone also uses a prefix-naming scheme for each appropriate sub-task.
For instance, here is a sketch of the layout of the Galois Connection team's entry in the ICFP 2000 programming contest:
dir CSG -- constructive solid geometry
file CSG/CSG.hs
file CSG/CSGConstruct.hs
file CSG/CSGGeometry.hs
file CSG/CSGInterval.hs
dir Fran -- Fran-style animation
file Fran/FranLite.hs
file Fran/FranCSG.hs
dir GML -- interpreter for little language
file GML/GMLData.hs
file GML/GMLParse.hs
file GML/GMLPrimitives.hs
So now the problem is that to actually build the software, you need to write a Makefile that descends into these directories. Or maybe you use 'hmake' like so:
hmake examples/chess.hs -ICSG -IFran -IGML -IRayTrace -package text
Note how many sub-directories you must remember to add to the command line (this applies equally for compiler options in Makefiles). Note also the inconsistency between compiling and linking my modules, against using and linking a "standard" hslibs module from package text.
Isn't there a simpler way? Yes. Namespaces. Prefix naming is no longer needed inside directories, because the directory name is part of the module name:
file CSG.hs -- re-exports everything from the CSG dir
dir CSG
file CSG/Construct.hs
file CSG/Geometry.hs
file CSG/Interval.hs
dir Fran
file Fran/Lite.hs
file Fran/CSG.hs -- does not conflict with top-level CSG.hs
dir GML
file GML/Data.hs
file GML/Parse.hs
file GML/Primitives.hs
And now, the commandline to 'hmake' (or compiler options in a Makefile) becomes simply:
hmake examples/chess.hs -I.
You only need to specify the root of the module tree (-I.), and all modules in all subdirectories can be found via their full namespace path as used in the source files. Note also that, whereas previously we needed to specify a package for whatever hslibs modules were used, now the compiler/hmake already knows the root of the installed hslibs tree and can use the same mechanism to find and link "standard" modules as for user modules.
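The search described above can be sketched as a pure function over a list of files known to exist. This is an assumption-laden illustration, not how hmake or any compiler is actually implemented: relativePath and findModule are hypothetical names, and the ".hs" suffix and "/" separator are assumed.

```haskell
import Data.List (find)

-- Turn a dotted module name into a relative path (".hs" suffix assumed).
relativePath :: String -> FilePath
relativePath name = map dotToSlash name ++ ".hs"
  where
    dotToSlash c = if c == '.' then '/' else c

-- Hypothetical lookup: try each search root in order and return the
-- first candidate path that appears in the list of existing files.
findModule :: [FilePath] -> [FilePath] -> String -> Maybe FilePath
findModule existing roots name =
  find (`elem` existing) [root ++ "/" ++ relativePath name | root <- roots]
```

With the single root "." and the layout shown earlier, the module Fran.CSG resolves to ./Fran/CSG.hs and the module CSG to ./CSG.hs, so the two never compete for the same file.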
From this example it should be clear that the use of module namespaces is of benefit to ordinary programs that may never become public, quite aside from any benefits we expect to derive in managing publicly-distributed library code.
Ok, so that's my proposal. The implementers of some of the main Haskell systems have seen a presentation of these ideas, and seemed to like them. Namespaces are already implemented in nhc98 (v1.02) and hmake (v2.02) if you want to play with them. I expect some discussion to refine this proposal on the 'libraries@haskell.org' list, to which everyone interested is invited.
Once we have nailed down the precise design, we need to get matching implementations in all systems. I have rashly volunteered to implement the lexical/parsing/module-search changes in any Haskell system that no-one else volunteers for (probably ghc, Hugs, possibly hbc).
But after that we will still have many more decisions to take about individual libraries, precise naming, build systems, and so on, not to mention actually writing the libraries. Get involved. Contribute.