Localization in Asia Pacific

Policy considerations for localization in Asia Pacific

The goal of localization is to enable communities to share and exchange information through ICTs. Achieving this goal would require planning and executing a strategy that can address the entire spectrum of associated issues. This section presents the considerations and recommendations for national, regional and international organizations to plan the development of language technology, especially in the context of Asia Pacific.

Majority vs. minority languages

National localization planning must strike a balance between the requirements of the majority and the minority. If the policy prioritizes localization based on the speaking population alone, minority languages may not be addressed. More rigorous criteria based on additional demographic and social factors need to be developed to include minority languages in localization, as these languages offer little incentive for commercial interests. Effective planning might even help preserve the linguistic diversity of the region and help protect endangered languages.

Breadth vs. depth of localization

Because most Asia Pacific countries have multiple spoken languages, resource allocation is a tricky task. Should multiple languages be taken up for basic localization, or should fewer languages be taken up for more in-depth advanced application development? If the focus remains only on basic localization because of the number of languages, advanced applications might never be addressed, even though they are necessary to provide access to information to a large part of the population in the region. On the other hand, if only advanced applications are considered, only a limited number of languages may be localized because advanced applications take much longer to develop.

Again, a complex socio-economic balance must be struck to determine the right formula for each national context.

Human resource training

In most Asia Pacific countries, there is very limited linguistic and technical capacity to develop standards, perform linguistic analysis and create language technology. Training and human resource planning are critical. Depending on the choice of applications and languages, expertise may be required in various branches of linguistics (phonetics, phonology, morphology, syntax, semantics and pragmatics), signal and speech processing, image processing, statistics, computational linguistics and advanced computing. Training for basic localization work could take about six months. To develop advanced applications, experienced linguists and computational linguists are required and dedicated training over many years is necessary. To address national needs and to keep the training process sustainable, diploma and degree programmes in speech, script and language processing should be developed at the universities, through collaboration among the linguistics, computer science and engineering departments. Scholarships dedicated to these areas for study abroad can also help accelerate the process. Regional and international cooperation can play a significant role in these efforts.

The best way to build capacity is to involve the technical development staff in actual hands-on localization work. This can be achieved by national and regional organizations funding language computing projects (see the case study on the PAN Localization Project below). Momentum for localization can also be triggered by governments if they create awareness of local language computing and generate market demand by requiring public information to be localized through e-governance initiatives. Regional organizations can organize national and regional training and seminars. Two recent initiatives are the Summer School in Asian Language Processing in 2006 organized by the PAN Localization project and Asian Applied Natural Language Processing for Linguistics Diversity and Language Resource Development (ADD) organized by the Thai Computational Laboratory.

PAN Localization: A regional initiative to develop local language computing capacity in Asia

The PAN Localization Project is a concrete example of a cohesive regional cooperative project to develop and disseminate local language computing technology in Asia Pacific. In the first phase, from 2004 to 2007, the project focused on developing (a) human resource, (b) technology and (c) policy related to language computing across Asia Pacific. In the second phase, from 2007 to 2010, the project will look into social models for enabling local language content access and generation by training rural communities to use local language computing technology. Thus, the project addresses the immediate need for localization in developing Asia.

The project is a collaboration among 11 countries: Afghanistan, Bangladesh, Bhutan, Cambodia, China, Laos, Mongolia, Nepal, Pakistan, the Philippines and Sri Lanka. It is coordinated by the Center for Research in Urdu Language Processing (CRULP, www.crulp.org) at the National University of Computer and Emerging Sciences (NUCES, www.nu.edu.pk) in Pakistan and funded by the Pan Asia Networking (PAN) programme of the International Development Research Centre (IDRC, www.idrc.ca). The project has also developed formal and informal collaboration with other countries, including India, Iran, Japan, Korea, Myanmar, Indonesia and Thailand.

The project supports a development team of about 100 people across the participating countries who are being trained and who are actively developing local language computing solutions in 15 different Asian languages. The project maintains a team in each collaborating country. The country teams decide the scope of work and the platform to localize based on the level of localization and the capacity of the available human resources. Development targets help the teams focus their capacity building efforts. In most cases, the country components are hosted at universities and public sector organizations to ensure sustainability. Sustainability is also addressed by contributing towards the development of formal research groups on localization. The project has already helped establish the Center for Research in Bangla Language Computing at BRAC University in Bangladesh, the Research Division at the Department of IT in Bhutan, the Language Technology Research Lab at the University of Colombo School of Computing in Sri Lanka, the Nepali Language Technology Group at the Madan Puraskar Pustakalaya and the Language Technology Lab at the University of Kathmandu in Nepal, the Speech Lab at the Institute of Technology of Cambodia and language and speech technology labs at the National University of Mongolia and Mongolian University of Science and Technology, respectively.

The project has arranged short and long-term national and regional training for its staff. For example, a mentor placement programme has allowed experienced personnel from Pakistan, India and Sri Lanka to be placed in Bhutan, Cambodia and Laos for two to six months. This has been noted as one of the most significant capacity building methods by the partner countries. A two-and-a-half month long Summer School in Asian Language Processing at NUCES, in 2006, addressed training in advanced language computing and helped build capacity in script, speech and language processing for 40 participants from 12 countries. Other training and workshops organized by the project are listed at the project website (see Activities link at www.PANL10n.net). The project has also been training end-users in local language computing applications, for example in Bhutan, Cambodia, Laos, Nepal, the Philippines and Sri Lanka. These have been on multiple platforms—for example, on Open Source platforms in Bhutan, Cambodia, Nepal and Sri Lanka, and on proprietary platforms in Laos, Cambodia and Sri Lanka. These efforts are being extended to all participating countries in the second phase of the project.

In its first phase, the project also developed a variety of local language computing solutions, including Pashto script, keyboard and collation standards; Bangla collation, lexicon, morphological analyzer and OCR; DzongkhaLinux distribution, including Dzongkha fonts, collation, keyboard and localized applications for word processing, e-mailing, Web browsing, chatting and multimedia; Khmer collation, lexicon, word segmentation, spell checker and tagged corpus; Lao fonts, collation, keyboard, lexicon and corpus; Nepali Linux distribution including Nepali collation, keyboard, spell checker and localized applications for word processing, e-mailing, Web browsing, chatting, accounting and multimedia; and Sinhala TTS and OCR, lexicon, collation and corpus (see the project website for a detailed list of current outputs). The project has also developed training materials for these and other applications in the local languages. Open licensing allows these outputs to be shared between the partner countries. For example, the OCR software developed for Sinhala by Sri Lanka has been retrained by the Laos team for Lao.

Equally significant is the development of a network of researchers in the region through the project. Experts, practitioners and policymakers have been brought together to interact and guide development teams in the participating countries. The project has also developed a repository of training materials and links to local language resources. It disseminates research outputs with open software and content licenses. Aside from local language software for nine languages, the outputs include research reports specific to the target languages and general guides, such as the Survey of Local Language Computing in Asia 2005 (Hussain et al. 2005) and A Guide to Linux Localization.

The project is helping research the challenges and solutions for creating localization awareness in the region; building sustainable human resource capacity; developing standards and basic and advanced localization technology; and forming a regional network of researchers. It has institutionalized localization in many of its partner countries and is directly and indirectly influencing relevant ICT policy. Thus, the project is addressing local language computing in a holistic fashion across Asia Pacific.

Partnerships and resource sharing

It is redundant and usually expensive to localize independently for all languages. A better model is to reuse the same basic technology for different languages. Most open source software works on this principle. Innovative mechanisms must be put in place to share content, training and other localization work. Regional and international organizations must play a significant role in this context, funding avenues through which research, training, resources and best practices may be shared across nations. Many such initiatives are developing in the region, such as the AFNLP, International Open Source Network (IOSN), Asia Open Source Software (AOSS) and Asia Commons, which are nongovernmental organizations. Many other technology frameworks are also available and being developed in universities and other organizations across the world.

Licensing regimes

As discussed, many different licensing regimes are possible both for the software and content being produced. As much as possible, open licensing must be adopted to propagate the work in local language computing. Liberal licenses, such as GPL, MIT and BSD, can allow open source distribution of software for non-profit as well as commercial purposes (cf. Chen 2006). Content must also be made available with liberal licensing for convenient access (for example, Creative Commons). In addition, effective channels are needed to share content and training curricula, perhaps using models similar to the Wikipedia and Sourceforge initiatives.

Because effective coordination cannot be achieved only through virtual communities, there is also a need for face-to-face networking. Regional and international organizations dedicated to social development through ICTs need to play an active leadership role in this regard. For example, the Free and Open Source Software in Asia Pacific (FOSSAP) forum by IOSN has been discussing software licensing and Asia Commons has started addressing content licensing.

Computing platforms

A very important aspect of localization is the choice of computing platform. Both proprietary and open source platforms exist and are currently being used. For end-users in Asia Pacific, the prevalent platforms include Microsoft Windows, Java Virtual Machine (JVM or Java) and varieties of Linux (for example, Red Hat and Debian). Windows is a proprietary software platform which is not free and has some security concerns [19]. Java is a virtual platform and requires a physical platform like Microsoft Windows or Linux on which it can be installed. Linux is open source and free of cost [20].

However, the choice is not as apparent as it seems. Though Windows is proprietary, closed and vulnerable to security threats, it is still the most widely used software with convenient plug-and-play hardware installation features, making it very convenient for end-users. The Linux platform requires more expertise to use and is more difficult to manage and maintain given the limited administrative and management capacity currently available. Deciding which platform to target for localization is a complex issue. For some languages which are already supported by Microsoft products, Windows may present a more viable short-term solution. For these languages, Linux may present a solution in the longer term, as there is a need to train more human resources to maintain Linux-based systems. For other languages that are not currently supported by Windows, open source platforms may be the only solution, as the localization plans of Microsoft may not align with national priorities.

Participatory standardization

With the growing need and demand for multilingual computing, there is increased standardization activity. Owing to the urgency and multiplicity of the tasks, there are very frequent meetings among the participating organizations across the world, as well as public requests for comments on the developing standards. However, due to lack of expertise and resources, it is difficult for many developing countries in Asia Pacific to participate in these discussions. Unfortunately, the standards organizations always treat lack of participation as tacit approval.

From an academic point of view, assuming approval when there is lack of comment is not always the best strategy for the development of standards despite the operational ease of this process. When multilingual standards are finalized without indigenous feedback, there are bound to be problems (for example, as reported for the Khmer Unicode page) especially once many of these languages catch up to the newer standards. The process of standardization must be proactive from both ends. National bodies must try to actively participate in the process and the standards development organizations should have programmes to train participants from different countries and to proactively seek their feedback before proceeding to finalize multilingual standards. This requires significant financial investment which has to be raised in a sustainable way. For example, the Asian Forum for Standardization of Information Technology (AFSIT) and associated programmes by CICC have contributed significantly in the areas of multilingual computing and related standardization training. Such efforts must continue in the future.

Translation of policy into projects

National policy alone will not ensure the development of local language computing. The policy must be translated into action plans, which in turn must be realized into projects with explicit funding allocation. The first step would be to develop a national committee of experts to discuss and finalize basic standards. Once standards are developed, basic localization for a language is possible for as little as USD 200,000 within one to two years. Developing a complete set of advanced applications would require considerably more effort and time (about three to five years to develop functional models and about a decade to mature), even when using existing software toolkits [21]. Building a complete suite of language technology for a single language could cost more than USD 5 million [22]. Basic localization may be undertaken by the private sector. However, because there are few commercial incentives for advanced applications in developing countries, these would only be developed with explicit support and funding by the government and other organizations.
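
To make the breadth versus depth trade-off concrete, the two cost figures quoted above can be put into a rough calculation. The sketch below is illustrative only: the budget is a hypothetical input, the function names are invented for this example, and both per-language costs are lower bounds in the text ("as little as" and "more than"), so the counts it returns are best-case upper limits rather than planning estimates.

```python
# Per-language cost figures quoted in the text (USD). Both are lower bounds,
# so every count computed here is an optimistic upper limit.
BASIC_COST_PER_LANGUAGE = 200_000         # basic localization, one to two years
FULL_SUITE_COST_PER_LANGUAGE = 5_000_000  # complete language technology suite


def languages_covered(budget_usd: int, cost_per_language: int) -> int:
    """Return how many languages a budget covers at a given per-language cost."""
    if budget_usd < 0 or cost_per_language <= 0:
        raise ValueError("budget must be non-negative and cost must be positive")
    return budget_usd // cost_per_language


def mixed_plan(budget_usd: int, advanced_languages: int) -> int:
    """Fund full suites for `advanced_languages` languages first, then return
    how many further languages could receive basic localization."""
    remaining = budget_usd - advanced_languages * FULL_SUITE_COST_PER_LANGUAGE
    if remaining < 0:
        raise ValueError("budget too small for that many full suites")
    return languages_covered(remaining, BASIC_COST_PER_LANGUAGE)


budget = 10_000_000  # hypothetical national budget, not a figure from the text
breadth_only = languages_covered(budget, BASIC_COST_PER_LANGUAGE)    # 50
depth_only = languages_covered(budget, FULL_SUITE_COST_PER_LANGUAGE)  # 2
one_suite_plus_basics = mixed_plan(budget, advanced_languages=1)      # 25
```

Under these assumptions, the same hypothetical budget covers basic localization for up to 50 languages, full suites for at most two, or one full suite plus basic localization for up to 25 more. The calculation ignores the very different timelines (one to two years versus roughly a decade) and the human resource constraints discussed earlier, both of which would narrow the options further.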



 
