Date of Effect XX October 2004
Contact IT Section
> CONTENTS
1 > PURPOSE
2 > SCOPE
3 > METADATA ELEMENTS
4 > ELEMENT EXPLANATIONS
4.1 > IDENTIFIER
4.2 > TITLE
4.3 > ALTERNATIVE
4.4 > SUBJECT
4.5 > DESCRIPTION
4.6 > SOURCE
4.7 > AUDIENCE
4.8 > DATE
4.9 > LANGUAGE
4.10 > TYPE
4.11 > FORMAT
4.12 > STATUS
4.13 > COVERAGE
4.14 > PRIORITY
4.15 > DISPOSAL
4.16 > LOCATION
4.17 > RELATION
4.18 > RIGHTS
4.19 > OCHA CLASSIFICATION
4.20 > UN RECORD REFERENCE
5 > INDEXING RULES
ANNEX 1 > METADATA ELEMENT STRUCTURE
ANNEX 2 > CONTROLLED VOCABULARY
ANNEX 3 > ANOTHER VIEW OF DUBLIN CORE
1 > PURPOSE
Metadata is very
commonly described as “data about data”, but more specifically it can be defined
as the data describing context, content and structure of documents and/or
records and their management throughout their useful and archival life. In recent years, various UN groups have
developed customized sets of metadata (schemas) with each fulfilling a specific
purpose. By defining a general OCHA metadata standard, OCHA aims to provide a
flexible standard that all OCHA activities can eventually adhere to and thus
improve such things as data retrieval, system integration, and system
usability.
2 > SCOPE
This standard outlines the different metadata elements that
should be adopted throughout OCHA. The
standard attempts to remain flexible while outlining multiple layers relating
to different foreseen object types.
The standard outlines both mandatory and optional metadata
elements. As it is known that very
different object types exist (i.e. map versus
OCHA systems and procedures should be
aligned with this policy. For example,
any electronic system that produces, uses, or stores data files, must
incorporate the need to include metadata.
It is understood that at the time of drafting this standard, all
existing systems in OCHA do not adhere to this standard. It is expected that either these systems will
find a method to map to the standard or incorporate the standard in future
versions.
3 > METADATA ELEMENTS
The proposed OCHA metadata elements were developed after reviews
of the UN Standard on Recordkeeping Metadata, ReliefWeb metadata, the Field
Information Support Metadata Working Paper, the OCHA document repository, and
the correspondence repository (Registry).
The standard has been developed in compliance with the Dublin Core
Metadata Initiative (see Annex 3 for graphical representation of Dublin Core
and metadata benefits) and ISO 15836:2003(E) – “Information and documentation
– The Dublin Core metadata element set”.
The elements listed below are broken into mandatory and
optional categories. It must be noted
that different elements are required depending on the object in question. The conditional category notes elements that
are mandatory in certain instances, but not in others.
Mandatory | Conditional | Optional |
Identifier | Audience | Alternative |
Title | Language | Description |
Subject | Format | Status |
Source | Rights | Coverage |
Date | | Priority |
Type | | Disposal |
Location | | Relation |
| | OCHA Classification |
| | UN Record Reference |
| | Metadata Stamp |
Note that
several of the elements listed in the table above are composed of refined
elements and thus their condition is defined on the refined elements rather
than on the general element.
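As a rough illustration of how a system might check the mandatory column of the table above, the following Python sketch flags missing elements. The record layout (a plain dictionary keyed by element name), the function name, and the sample values are assumptions made for this example and are not part of the standard. A real system would also need to apply the check at the refined-element level where an element is composed of refined elements.

```python
# Mandatory elements as listed in the table above. The record layout
# (a dict keyed by element name) is an assumption for illustration only.
MANDATORY_ELEMENTS = (
    "Identifier", "Title", "Subject", "Source", "Date", "Type", "Location",
)


def missing_mandatory(record):
    """Return the mandatory elements that are absent or empty in a record."""
    return [name for name in MANDATORY_ELEMENTS if not record.get(name)]


# Hypothetical, partially indexed record.
record = {
    "Title": "OCHA – Archiving Policy – version 1",
    "Subject": ["Appeal"],
    "Date": "01-10-2004",
}

print(missing_mandatory(record))
# ['Identifier', 'Source', 'Type', 'Location']
```

A check of this kind could run at the point of data entry, so that incomplete records are reported before they are stored.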
4 > ELEMENT EXPLANATIONS
4.1 > Identifier
Identifier |
Definition | A unique reference for a given object |
Purpose | To distinguish one object from another |
Use | Use refined elements (below) |
Refined Elements | |
4.2 > Title
Title |
Definition | Full “official” title as it appears on the document |
Purpose | To assist in identification and retrieval |
Use | Free text; it must include the object’s title |
Obligation | Mandatory |
Repeatable | No |
Default Value | None |
Source/Scheme | User defined, but automatic capture is recommended where a feasible system solution exists |
Comments | It is always recommended to use meaningful titles. If the “official” title does not cover all elements of the object or is not deemed meaningful, then the “Alternative” and “Description” elements should be used. |
Example | OCHA – Archiving Policy – version 1 |
4.3 > Alternative
Alternative |
Definition | A substitute or alternative to the “official” title of the object |
Purpose | To provide a more “user-friendly” or more “appropriate” title for the object |
Use | Free text |
Obligation | Optional |
Repeatable | No |
Default Value | None |
Source/Scheme | User defined |
Comments | This element can include abbreviations and translations |
Example | Official Title: Consolidated Appeal Process – Democratic People’s Republic of Korea – 2004 Alternative: |
4.4 > Subject
Subject |
Definition | Keywords or key phrases that describe the content/subject of the object |
Purpose | To provide a more structured or ordered method for retrieval than Title or Alternative can provide |
Use | Selected from a pre-determined controlled vocabulary |
Obligation | Mandatory |
Repeatable | Yes |
Default Value | None |
Source/Scheme | User chooses from list of controlled vocabulary (see Annex 2) |
Comments | The controlled vocabulary encompasses all OCHA-related business and should only be updateable at a central source (rather than in each application) |
Example | Appeal, Disaster, Relief, Environment, Children, etc. |
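One way a system might enforce the Subject rule is to accept only terms found in the central vocabulary and report the rest back to the user. The sketch below assumes a tiny in-memory vocabulary taken from the example terms; the set name, function name, and the term "Flooding" are hypothetical, and a real system would load the master list from the central source.

```python
# Hypothetical subset of the controlled vocabulary, built from the example
# terms above. A real system would read the centrally maintained master list.
CONTROLLED_SUBJECTS = {"Appeal", "Disaster", "Relief", "Environment", "Children"}


def check_subjects(terms):
    """Split terms into (accepted, rejected) against the controlled vocabulary."""
    accepted = [term for term in terms if term in CONTROLLED_SUBJECTS]
    rejected = [term for term in terms if term not in CONTROLLED_SUBJECTS]
    return accepted, rejected


accepted, rejected = check_subjects(["Relief", "Children", "Flooding"])
print(accepted)  # ['Relief', 'Children']
print(rejected)  # ['Flooding']
```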
4.5 > Description
Description |
Definition | Account of the object’s content |
Purpose | To provide additional information regarding the object and provide an additional means for search and retrieval |
Use | Free text |
Obligation | Optional |
Repeatable | Yes |
Default Value | None |
Source/Scheme | User enters free text that describes the object more fully than the Title, Alternative, and Subject elements do |
Comments | Can include such things as an abstract or table of contents, but should not duplicate the information in the Title, Alternative or Subject elements |
Example | Abstract from this document: “This standard outlines the different metadata elements that should be adopted by the OCHA Document Repository. The standard attempts to remain flexible while outlining multiple layers relating to different foreseen document types.” |
4.6 > Source
Source |
Definition | Entity from which the content was derived or created |
Purpose | To identify where the object has come from |
Use | Use refined elements (below) |
Refined Elements | |
4.7 > Audience
Audience |
Definition | The entity for whom the resource is intended or useful |
Purpose | To capture for whom the object is intended |
Use | Use refined elements (below) |
Refined Elements | |
4.8 > Date
Date |
Definition | A given date related to a point in the life cycle of the object |
Purpose | To identify significant dates in relation to the object |
Use | Use refined elements (below) |
Refined Elements | |
4.9 > Language
Language |
Definition | The language of the object’s content |
Purpose | To capture the authoring language of the object |
Use | Controlled vocabulary from ISO639 Standard |
Obligation | Mandatory for non-inventory items |
Repeatable | Yes |
Default Value | None |
Source/Scheme | ISO639-2, the alpha-3 language code, is recommended. ISO639-1, the alpha-2 language code, is optional. (See Annex 2.3) |
Comments | Highly recommended to use the ISO639-2, but consideration must be given in all systems for future connectivity with other systems that are using ISO639-1 |
Examples | ISO639-2 – English is coded as “eng”; ISO639-1 – French is coded as “fr” |
4.10 > Type
Type |
Definition | The structure that the object takes |
Purpose | To help in the interpretation of the object’s contents and future retrieval |
Use | Use refined elements (below) |
Refined Elements | |
4.11 > Format
Format |
Definition | The media-type in which the object’s contents are stored as well as size of the object when applicable |
Purpose | To help in the storage and retrieval of the given object |
Use | Use refined elements (below) |
Refined Elements | |
Specific Refined Elements | As can be seen above, each object type may have several specific elements. For that reason only a sample is given here. Systems can track their own details, but it is recommended that details be stored in a commonly accepted standard/format. |
4.12 > Status
Status |
Definition | Determines the current status/availability of the given object |
Purpose | To define the situation of the object and determine relevance during retrieval |
Use | Select from pre-defined controlled vocabulary |
Obligation | Optional |
Repeatable | No |
Default Value | None |
Source/Scheme | Controlled vocabulary (See Annex 2.6) |
Comments | This element will help in controlling the use of the object, in search and retrieval, and in determining relevance by a given system user |
Examples | Draft, Pending Approval, Approved, Published |
4.13 > Coverage
Coverage |
Definition | Describes the spatial aspects of the given object |
Purpose | To outline geographical information about the object |
Use | Use refined elements (below) |
Refined Elements | |
4.14 > Priority
Priority |
Definition | A ranking that distinguishes between objects’ importance |
Purpose | To ensure that important objects are included in search results on certain topics |
Use | Select from pre-defined controlled vocabulary |
Obligation | Optional |
Repeatable | No |
Default Value | None |
Source/Scheme | Controlled vocabulary (See Annex 2.8) |
Comments | Using this element helps ensure that important, relevant documents are the most visible, whereas obscure documents that need to be kept but are probably not relevant to the vast majority of users can be given a lower-than-normal ranking |
Examples | Low, Normal, Important, High |
4.15 > Disposal
Disposal |
Definition | Details regarding the retention and disposal of a given object |
Purpose | To ensure proper record retention and disposal |
Use | Use refined elements (below) |
Refined Elements | |
4.16 > Location
Location |
Definition | The virtual and/or physical location of a given object |
Purpose | To help retrieval and verification of a given object |
Use | Use refined elements (below) |
Refined Elements | |
4.17 > Relation
Relation |
Definition | Any object related to the given object |
Purpose | To link related objects and provide a “related to” search/retrieval function |
Use | Use refined elements (below) |
Refined Elements | |
4.18 > Rights
Rights |
Definition | Determines the rights, including copyrights, security, and responsibility of a given object |
Purpose | To ensure that proper use, security and responsibility are assigned to the object |
Use | Use refined elements (below) (See Annex 2.10) |
Refined Elements | |
4.19 > OCHA Classification
OCHA Classification |
Definition | A classification assigned to the object based on OCHA’s classification scheme |
Purpose | To help classify the object based on OCHA’s business lines and provide an additional parameter for future retrieval |
Use | Should either be generated by the system in use or be retrieved from a separate system |
Obligation | Optional |
Repeatable | No |
Default Value | None |
Source/Scheme | OCHA Classification (See Annex 2.11) |
Comments | This element will help in filing and future retrieval by allowing users to think according to OCHA’s business lines |
Examples | Accounting - Payroll |
4.20 > UN Record Reference
UN Record Reference |
Definition | An ID assigned by the UN ARMS group should the record be archived into the UN ARMS |
Purpose | To help facilitate the retrieval process of a physically archived object that is held in the UN Archives |
Use | Will be communicated from the UN ARMS |
Obligation | Optional |
Repeatable | No |
Default Value | None |
Source/Scheme | UN ARMS ID classification |
Comments | Should a physically archived object need to be retrieved from the UN Archives, this reference will greatly help in the retrieval request and process |
Examples | - |
5 > INDEXING RULES
When entering the metadata, or indexing, for a given object there are four basic rules to follow.
Rule | Description |
Specificity Rule | Apply the most specific terms when tagging objects. Specific terms can always be generalized, but generic terms cannot be specialised. |
Repeatable Rule | Use as many terms as necessary to describe “What the object is about” and “Why it is important”. Storage is cheap. Re-creating content is expensive. |
Appropriateness Rule | Not all attributes apply to all objects. Only supply values for attributes that make sense. |
Usability Rule | Anticipate “how the asset will be searched for” in the future, and “how to make it easy to find”. Remember that search engines can only operate on explicit information. |
ANNEX 1 > METADATA ELEMENT STRUCTURE
ANNEX 2 > CONTROLLED VOCABULARY
Annex 2.1 > Subject
When considering the
controlled vocabulary for the Subject metadata element, an organization must
consider all aspects of its business. Using
a pre-defined set of keywords in any system will improve searching abilities as
well as data exchange between different systems.
Even with a complete
master list in place the expansiveness of this list will be rather overwhelming
even for a regular power-user. So
consider what a regular user might experience when trying to find the right
keywords – confusion, frustration, anger, and dissatisfaction. From that point, one realizes that any system
must be able to provide a subset of the keywords related to the business/system
at hand.
In the FIS Metadata
Working Paper under the “Topics and Keywords” section, the suggested approach
is to group the keywords into different topics.
In their example, they have separated their topics and keywords by object
type first (essentially filtering the keywords as appropriate). But one problem still remains – different
object types will require the same topics and keywords. In their example, one Topic is listed as
“Administration” whereas in another instance it is listed as “Administration
& Management” – something a controlled vocabulary should not do.
The FIS example points out three things:
1. A central/master keyword repository is required
2. Keywords should be grouped into categories to allow easier understanding
3. Keywords should be allowed to belong to multiple categories – this allows “related to”, “find items similar to”, “find categories related to” functions, etc.
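The three points above can be sketched as a single master list in which each keyword records every category it belongs to. This is only a minimal illustration: the keywords, categories, and function names below are hypothetical and are not drawn from the FIS, ReliefWeb, or Registry lists.

```python
from collections import defaultdict

# Hypothetical master list: each keyword maps to the set of categories it
# belongs to, so a keyword can sit in more than one category.
KEYWORD_CATEGORIES = {
    "Payroll": {"Administration"},
    "Flood": {"Disaster", "Environment"},
    "Drought": {"Disaster", "Environment"},
    "Earthquake": {"Disaster"},
}


def keywords_by_category(master):
    """Invert the master list into category -> set of keywords."""
    index = defaultdict(set)
    for keyword, categories in master.items():
        for category in categories:
            index[category].add(keyword)
    return index


def related_keywords(keyword, master):
    """Return keywords that share at least one category with the given keyword."""
    index = keywords_by_category(master)
    related = set()
    for category in master.get(keyword, ()):
        related |= index[category]
    related.discard(keyword)
    return related


print(sorted(keywords_by_category(KEYWORD_CATEGORIES)["Environment"]))
# ['Drought', 'Flood']
print(sorted(related_keywords("Flood", KEYWORD_CATEGORIES)))
# ['Drought', 'Earthquake']
```

Because every system would read from the same master list, a category name can only exist in one spelling, which avoids the “Administration” versus “Administration & Management” inconsistency noted above.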
ReliefWeb has a clearly defined set of subject keywords.
As well, they have recently undergone a thorough reorganization of their
website and related materials. Their
keywords should be incorporated into any keyword list used inside OCHA.
At the moment, the Registry personnel are drawing up a classification scheme that will
attempt to cover all of OCHA’s business while adhering to UN standards. Once this scheme has been completed, it
should be used, along with the FIS and ReliefWeb keywords, to develop a master
list of Categories and Keywords. Due to
the expected size and expected regular use of the subject keywords, the list
will be presented in its own document.
Annex 2.2 > Date(s)
Although not a controlled vocabulary in the true sense,
any date entered into a system should
always be controlled. Given that the
user can enter any date they wish (vocabulary), it is recommended that the
system ensure that the date is valid (not February 30) and is in a controlled
format.
The UN Standard date
format, “DD-MM-YYYY”, must be used. As
long as the system can ensure easy understanding of the given date, can present
the date in a UN Standard format, and controls the entry to ensure no mistakes,
then it will be deemed sufficient.
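A minimal sketch of such a date control, using only the Python standard library, is shown below. The function and constant names are illustrative assumptions; the only requirements taken from this annex are the “DD-MM-YYYY” format and the rejection of impossible dates.

```python
import re
from datetime import datetime

UN_DATE_FORMAT = "%d-%m-%Y"                       # DD-MM-YYYY
UN_DATE_PATTERN = re.compile(r"\d{2}-\d{2}-\d{4}")


def parse_un_date(text):
    """Return a date for a valid DD-MM-YYYY string, otherwise None."""
    # Enforce the exact layout first (two digits, two digits, four digits).
    if not UN_DATE_PATTERN.fullmatch(text):
        return None
    try:
        # strptime rejects impossible dates such as 30 February.
        return datetime.strptime(text, UN_DATE_FORMAT).date()
    except ValueError:
        return None


def format_un_date(value):
    """Present a date in the DD-MM-YYYY format."""
    return value.strftime(UN_DATE_FORMAT)


print(parse_un_date("29-02-2004"))   # 2004-02-29
print(parse_un_date("30-02-2004"))   # None
print(parse_un_date("2004-10-01"))   # None
print(format_un_date(parse_un_date("01-10-2004")))  # 01-10-2004
```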
Annex 2.3 > Language
The use of ISO639-2,
which presents languages in alpha-3 language codes, is the recommended
controlled vocabulary.
An alpha-2
language-coding standard also exists – ISO639-1. Should any system make use of this standard, it
is recommended that it provide facilities that can map its entries to the
ISO639-2 standard to ensure interoperability with other OCHA systems.
Due to the length of
the content, both the ISO639-1 and ISO639-2 have been captured in a separate
document entitled “Language
Vocabulary - Metadata Standards – OCHA.xls”.
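The mapping facility recommended above could look something like the following sketch. The table holds only a handful of codes for illustration, and the dictionary and function names are assumptions; a real system would load the complete tables from the separate language vocabulary document.

```python
# Illustrative subset only; the complete ISO639-1 and ISO639-2 tables are
# kept in the separate language vocabulary document.
ALPHA2_TO_ALPHA3 = {
    "en": "eng",
    "fr": "fra",  # ISO 639-2/T code; the bibliographic (B) code is "fre"
    "es": "spa",
    "ar": "ara",
    "ru": "rus",
}


def to_iso639_2(code):
    """Normalise an alpha-2 or alpha-3 language code to its alpha-3 form."""
    code = code.strip().lower()
    if code in ALPHA2_TO_ALPHA3.values():
        return code                      # already an alpha-3 code we know
    if code in ALPHA2_TO_ALPHA3:
        return ALPHA2_TO_ALPHA3[code]    # map alpha-2 to alpha-3
    raise ValueError(f"Unknown language code: {code!r}")


print(to_iso639_2("EN"))   # eng
print(to_iso639_2("fra"))  # fra
```

Raising an error for unknown codes, rather than passing them through, keeps uncontrolled values out of the Language element.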
Annex 2.4 > Type
The Type metadata
element is used to describe the object’s nature. The controlled lists to be used are:
Object Type:
(See Dublin Core for details: http://dublincore.org/documents/dcmi-type-vocabulary/)
Dataset
Event
Image
Interactive Resource
Moving Image
Physical Object
Service
Software
Sound
Text
Object Contents:
Need to get a list from the Registry group & input from others
Appeal
Letter
Memo
Annex 2.5 > Format (Media-type)
A general list of the
formats most commonly used within the humanitarian community was put together by the
FIS group. A couple of small additions have
also been included.
Spreadsheet / Database |
DBF | dBase file structure |
XLS | Microsoft Excel |
MDB | Microsoft Access |
INFO | |
RDBMS | Specific (e.g. SQL Server, Oracle), etc. |
Text / Document |
DOC | Microsoft Word |
EPS | Encapsulated PostScript Version |
PDF | Portable Document Format (Adobe) |
PPT | Microsoft PowerPoint |
PS | PostScript Language Reference (Adobe) |
RTF | Rich Text Format |
TXT | ASCII text file |
Graphics Files |
BMP | Bitmap |
CGM | Computer Graphics Metafile |
GIF | Graphics Interchange Format |
JPEG | JPEG Compression Standard |
PIC | Lotus PIC Format |
PNG | Portable Network Graphics Specification |
TIFF | Tagged Image File Format |
Internet Related |
ASP | Active Server Page |
HTML | HyperText Markup Language |
XML | Extensible Markup Language |
GIS Related |
ADF | Arc/Info Grid Format |
DAT | MapInfo Native Data Format |
DGN | Intergraph Standard File |
DTED | Digital Terrain Elevation Data (NIMA) |
DWG | AutoCAD Drawing File |
DXF | AutoCAD Data Exchange Format |
E00 | Standard Exchange Format for Arc/Info |
IMG | Erdas Imagine Format |
MIF | MapInfo Interchange Format |
MrSID | Multi Resolution Seamless Image Data |
SHP | ESRI’s ArcView ShapeFile |
SDTS | Spatial Data Transfer Standard |
TIFF | GeoTIFF Format Specification |
Archive Files |
ARJ | Archive / Compression Format |
RAR | WinRAR Archive / Compression Format |
ZIP | WinZip Archive / Compression Format |
Sound and Video |
AVI | Audio / Video Interleaved file format |
MOV | Movie File |
MPEG/MP3 | Moving Picture Experts Group |
RAM | Real Audio Format |
SWF | Macromedia Flash File Movie Format |
WAV | WAV Sound File Format |
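Where a system captures the Format element automatically, one simple approach is to look up the file extension in the list above. The sketch below is an assumption about how that might be done: it covers only a few entries, and the names `FORMAT_GROUPS` and `media_type` are hypothetical. Extension lookup alone cannot settle cases such as TIFF, which appears under both Graphics Files and GIS Related.

```python
import os

# Illustrative subset of the format list above, keyed by upper-case extension.
FORMAT_GROUPS = {
    "XLS": "Spreadsheet / Database",
    "DOC": "Text / Document",
    "PDF": "Text / Document",
    "GIF": "Graphics Files",
    "HTML": "Internet Related",
    "SHP": "GIS Related",
    "ZIP": "Archive Files",
    "WAV": "Sound and Video",
}


def media_type(filename):
    """Return (extension, group) for a file name; group is None if unlisted."""
    extension = os.path.splitext(filename)[1].lstrip(".").upper()
    return extension, FORMAT_GROUPS.get(extension)


print(media_type("appeal_2004.pdf"))  # ('PDF', 'Text / Document')
print(media_type("notes.xyz"))        # ('XYZ', None)
```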
Annex 2.6 > Status
A controlled vocabulary for Status is highly recommended, especially should OCHA adopt
any workflow-oriented system(s). Below
is an example set that should only be used as a reference point – a more
finalized list must be developed that is in agreement with all OCHA groups.
Annex 2.7 > Coverage
Three refined elements
of Coverage will use a controlled vocabulary.
They are:
1. Continent: is based on the UN Population and Statistics Division
2. Region: is also based on the UN Population and Statistics Division
3. Country: is defined according to the ISO-3166-1 standard. The alpha-3 coding scheme of this standard is used to define each country.
Given the expected
high use of these elements, a separate document (Coverage Vocabulary - Metadata
Standards - OCHA) has been prepared.
Note: There is also a
newer country standard that has emerged in relation to the Internet called the
country code Top-Level Domain Identifiers (ccTLDs) – it uses an alpha-2 coding
scheme to define each country. Although
this system is feasible, it is recommended that the alpha-3 system be used as
it is foreseen to be easier to understand for the regular user.
Therefore any system
using the alpha-2 coding system should make provisions for exporting or mapping
its data to any other OCHA system that uses the ISO-3166-1 standard.
Annex 2.8 > Priority
A controlled
vocabulary for Priority should be defined according to OCHA’s work
processes/procedures. An example set
would be:
Low
Normal
Important
High
Urgent
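If the example set above were adopted, a system could store it as an ordered enumeration so that search results can be ranked by priority. In the sketch below, the numeric values, the assumption that the list runs from least to most important, and the sample object titles are all illustrative and are not defined by this standard.

```python
from enum import IntEnum


class Priority(IntEnum):
    """Example priority set; the ascending numeric order is an assumption."""
    LOW = 1
    NORMAL = 2
    IMPORTANT = 3
    HIGH = 4
    URGENT = 5


def sort_by_priority(objects):
    """Order (title, priority) pairs from most to least important."""
    return sorted(objects, key=lambda item: item[1], reverse=True)


# Hypothetical search results.
results = sort_by_priority([
    ("Archived memo", Priority.LOW),
    ("Flash appeal", Priority.URGENT),
    ("Situation report", Priority.NORMAL),
])
print([title for title, _ in results])
# ['Flash appeal', 'Situation report', 'Archived memo']
```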
Annex 2.9 > Disposal
> For Disposal Date and the Review Date, please see Annex 2.2 > Date(s) for guidelines.
> Disposal Authorized By should:
- Specify a person / unit that is permitted to dispose of a given object.
- Be a user / unit within a given system (a similar idea to Windows security)
Annex 2.10 > Rights
> Access Rights should:
- Be specified by a selection from the system (in a similar fashion to Windows security)
- Specify the person/people who have access to the object and their given permission
> Responsible should:
- Be specified by a selection from the system (in a similar fashion to Windows security)
Annex 2.11 > OCHA Classification
The OCHA Registry unit
is currently working on developing a standard OCHA Classification scheme that
can be used within a given system if deemed appropriate.
Again, given the nature and size of this document, it will be placed inside a separate file.
ANNEX 3 > Another View of Dublin Core
Subject metadata – What & Why: Subject, Description, Coverage |
Use metadata – How can it be used: Rights and Permissions |
Asset metadata – Who, Where & When: Title, Creator, Publisher, Contributor, Date, Type, Format, Identifier, Source, Language |
Relational metadata – Links between and to: Relation |