Take the community feedback survey now.

Johan Björnfot
Nov 15, 2016
  1460
(0 votes)

Internationalized Resource Identifiers (IRIs)

An Internationalized Resource Identifier (IRI) is a network address that contain non ASCII characters as below:

Image IRI.png

EPiServer CMS has previously (prior to 10) only allowed characters in url segments according to RFC 1738 which basically allows ALPHA / DIGIT / '-'/ '_'/ '~' / '.'/ '$'/. 

It is now (from CMS.Core version 10.1.0) however possible to define a custom character set that are used for url segments and simple address. This is done by registering an instance of UrlSegmentOptions with a custom regular expression in IOC container. When an expression is set that allows characters outside RFC 1738 the setting UrlSegementOptions.Encode is recommended to be set to true so that url:s gets properly encoded. Below is an example of how a character set that allows unicode characters in the letter category.

using EPiServer.ServiceLocation;
using EPiServer.Framework.Initialization;
using EPiServer.Framework;
using EPiServer.Web;

namespace EPiServerSite6
{
    [ModuleDependency(typeof(EPiServer.Web.InitializationModule))]
    public class IRIConfigurationModule : IConfigurableModule
    {
        public void ConfigureContainer(ServiceConfigurationContext context)
        {
            context.Services.RemoveAll<UrlSegmentOptions>();
            context.Services.AddSingleton<UrlSegmentOptions>(s => new UrlSegmentOptions
            {
                Encode = true,
                ValidUrlCharacters = @"\p{L}0-9\-_~\.\$"
            });
        }

        public void Initialize(InitializationEngine context)
        {}

        public void Uninitialize(InitializationEngine context)
        {}
    }
}

UrlSegmentOptions also exposes a CharacterMap property where it is possible to define a mapping for unsupported characters, for example 'ö' => 'o'. 

Internationalized Domain Names (IDN)

As explained in IDN and IRI are internationalized domain names registered in its punycode format (a way of representing Unicode characters using only ASCII characters). 

Internationalized domain names should be registered in admin mode under Manage Websites in their punycode format. 

Nov 15, 2016

Comments

Please login to comment.
Latest blogs
A day in the life of an Optimizely OMVP - Opticon London 2025

This installment of a day in the life of an Optimizely OMVP gives an in-depth coverage of my trip down to London to attend Opticon London 2025 held...

Graham Carr | Oct 2, 2025

Optimizely Web Experimentation Using Real-Time Segments: A Step-by-Step Guide

  Introduction Personalization has become de facto standard for any digital channel to improve the user's engagement KPI’s.  Personalization uses...

Ratish | Oct 1, 2025 |

Trigger DXP Warmup Locally to Catch Bugs & Performance Issues Early

Here’s our documentation on warmup in DXP : 🔗 https://docs.developers.optimizely.com/digital-experience-platform/docs/warming-up-sites What I didn...

dada | Sep 29, 2025

Creating Opal Tools for Stott Robots Handler

This summer, the Netcel Development team and I took part in Optimizely’s Opal Hackathon. The challenge from Optimizely was to extend Opal’s abiliti...

Mark Stott | Sep 28, 2025

Integrating Commerce Search v3 (Vertex AI) with Optimizely Configured Commerce

Introduction This blog provides a technical guide for integrating Commerce Search v3, which leverages Google Cloud's Vertex AI Search, into an...

Vaibhav | Sep 27, 2025

A day in the life of an Optimizely MVP - Opti Graph Extensions add-on v1.0.0 released

I am pleased to announce that the official v1.0.0 of the Opti Graph Extensions add-on has now been released and is generally available. Refer to my...

Graham Carr | Sep 25, 2025