# Caret-Separated Text

Caret-Separated Text (CST) is a key-value format: each key is a number, and its value is the string enclosed between carets (`^`), which contains the translation. Any text not enclosed in carets is treated as a comment and ignored.
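To make the format concrete, here is a minimal sketch that reads a small, made-up CST document into a dictionary. The sample text, the `ReadAll` name, and the assumption that a key sits at the start of a line directly before its opening caret are all illustrative. This is a regex-based illustration of the format, not how CST.NET itself parses files.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical sample document; the keys, values and comment are made up.
var sample = "This line has no carets, so it is a comment.\n" +
             "1 ^Hello^\n" +
             "2 ^Two\nlines^\n";

// Collects every key and its caret-enclosed value into a dictionary.
Dictionary<string, string> ReadAll(string content)
{
    var result = new Dictionary<string, string>();
    // Key: a run of non-whitespace, non-caret characters at the start of a line.
    // Value: everything up to the next caret, line breaks included.
    var pattern = new Regex(@"^[ \t]*([^\s^]+)[ \t]*\^([^^]*)\^",
        RegexOptions.Multiline);

    foreach (Match m in pattern.Matches(content))
        result[m.Groups[1].Value] = m.Groups[2].Value;

    return result;
}

var sampleEntries = ReadAll(sample);
Console.WriteLine(sampleEntries.Count);  // 2: the comment line is not an entry
Console.WriteLine(sampleEntries["1"]);   // Hello
```

The value of key `2` spans two lines, which shows that the carets, not the line breaks, delimit a value.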

## CST.NET

CST.NET locates each key with .NET's built-in string search methods (`StartsWith` and `IndexOf`). As a consequence, keys are not limited to numbers; any string can serve as a key. I also added a normalization step to the pipeline that converts the document's line endings to the system's, in order to prevent crashes.
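The normalization step can be sketched on its own as follows. `NormalizeLineEndings` is a hypothetical helper name, and the input string is made up. The point of the sketch is the order of the replacements: CRLF has to be handled before CR and LF, otherwise each half of a CRLF pair is converted separately and one line break becomes two.

```csharp
using System;

// Converts CRLF, CR, LF and the Unicode line separator to Environment.NewLine.
string NormalizeLineEndings(string text)
{
    // Collapse everything to LF first; CRLF goes first so that its two
    // halves are not each treated as a separate line break.
    var lfOnly = text.Replace("\r\n", "\n")
                     .Replace("\r", "\n")
                     .Replace("\u2028", "\n");

    // Then map LF to whatever the current system uses.
    return lfOnly.Replace("\n", Environment.NewLine);
}

// Made-up input mixing three different line endings.
var mixed = "1 ^one^\r\n2 ^two^\r3 ^three^\n";
var normalized = NormalizeLineEndings(mixed);
var lines = normalized.Split(new[] { Environment.NewLine },
    StringSplitOptions.RemoveEmptyEntries);
Console.WriteLine(lines.Length); // 3
```

On .NET 6 and later, `string.ReplaceLineEndings()` covers the same ground without a hand-written helper.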

In [1]:
using System;
using System.IO;
using System.Collections.Generic;

In [1]:
public static class CST
{
    const char CARET = '^';
    static readonly string _lf = "\u000A";
    static readonly string _cr = "\u000D";
    static readonly string _crlf = "\u000D\u000A";
    static readonly string _ls = "\u2028";

    /// <summary>
    /// Gets the value from the integer-based key.
    /// </summary>
    /// <returns>Returns the entry</returns>
    public static string Parse(string content, int key)
    {
        var entries = NormalizeEntries(content);
        return GetEntry(entries, key.ToString());
    }

    /// <summary>
    /// Gets the value from the string-based key.
    /// </summary>
    /// <returns>Returns the entry</returns>
    public static string Parse(string content, string key)
    {
        var entries = NormalizeEntries(content);
        return GetEntry(entries, key);
    }

    /// <summary>
    /// Replaces the document's line endings with the native system line endings.
    /// </summary>
    /// <remarks>This stage ensures there are no crashes during parsing.</remarks>
    static IEnumerable<string> NormalizeEntries(string content)
    {
        // Splitting directly on every caret + line-ending variant did not
        // work, so the line ending after each caret is normalized first.
        // Everything is reduced to LF, with CRLF handled before CR so that
        // its two halves are not converted twice, then mapped to the system's.
        content = content
            .Replace($"{CARET}{_crlf}", $"{CARET}{_lf}")
            .Replace($"{CARET}{_cr}", $"{CARET}{_lf}")
            .Replace($"{CARET}{_ls}", $"{CARET}{_lf}")
            .Replace($"{CARET}{_lf}", $"{CARET}{Environment.NewLine}");

        // Each entry ends with a caret followed by a line break.
        var entries = content.Split(new[] { $"{CARET}{Environment.NewLine}" },
            StringSplitOptions.RemoveEmptyEntries);
        var newContent = new List<string>();

        foreach (var entry in entries)
        {
            // Skip comments
            if (entry.StartsWith("//") || entry.StartsWith("#") ||
                entry.StartsWith("/*") || entry.EndsWith("*/"))
                continue;

            newContent.Add(entry);
        }

        return newContent;
    }

    static string GetEntry(IEnumerable<string> entries, string key)
    {
        // Search through list
        foreach (var entry in entries)
        {
            if (!entry.StartsWith(key, StringComparison.Ordinal))
                continue;

            // Require the key to end here, so that "1" does not match "10".
            if (entry.Length > key.Length && char.IsLetterOrDigit(entry[key.Length]))
                continue;

            // An entry without an opening caret carries no value.
            var startIndex = entry.IndexOf(CARET);
            if (startIndex < 0)
                continue;

            // Reduce every line ending to LF (CRLF first, so that it is not
            // converted twice), then map LF to the system's line ending.
            var line = entry.Substring(startIndex)
                .Replace(_crlf, _lf)
                .Replace(_cr, _lf)
                .Replace(_ls, _lf)
                .Replace(_lf, Environment.NewLine);

            // Trim carets and return translation
            return line.Trim(CARET);
        }

        return "[ENTRY NOT FOUND]";
    }
}

In [1]:
var v1Path = Path.Combine(Environment.CurrentDirectory, "data", "v1.cst");
var v1File = File.ReadAllText(v1Path);
var one = CST.Parse(v1File, 1);
var three = CST.Parse(v1File, 3);
var four = CST.Parse(v1File, 4);
Console.WriteLine($"One:{Environment.NewLine}{one}");
Console.WriteLine($"Three:{Environment.NewLine}{three}");
Console.WriteLine($"Four:{Environment.NewLine}{four}");

One:
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Proin ac dictum orci, at tincidunt nulla. Donec aliquet, %1 eros non interdum posuere, ipsum sapien molestie nunc, nec facilisis libero ipsum et risus. In sed lorem vel ipsum placerat viverra.


Three:
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Aliquam venenatis ac odio ut pretium. Interdum et malesuada fames ac ante ipsum primis in faucibus. Donec semper turpis tempor, bibendum sapien at, blandit neque. Vivamus hendrerit imperdiet elit, vel sollicitudin nulla luctus vel. Vivamus nisl quam, feugiat a diam aliquam, iaculis vestibulum nunc. Maecenas euismod leo enim, faucibus ultrices ipsum semper eu. Praesent ullamcorper justo at maximus ultricies.


Four:
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Fusce justo dui, rhoncus a pulvinar sit amet, fermentum vitae lorem. Maecenas nec nisi sit amet eros rutrum congue. In sagittis suscipit arcu, ac vestibulum nunc feugiat volutpat.

Vivamus consequat velit dui, sit amet rhoncus dui malesuada a. Maecenas hendrerit commodo mi et scelerisque. Cras pharetra ultrices aliquam. Praesent ac efficitur magna, vitae scelerisque metus.


In [1]:
var v2Path = Path.Combine(Environment.CurrentDirectory, "data", "v2.cst");
var v2File = File.ReadAllText(v2Path);
var singleLineV2 = CST.Parse(v2File, "Singleline");
var multiLineV2 = CST.Parse(v2File, "Multiline");
Console.WriteLine($"Single line v2:{Environment.NewLine}{singleLineV2}");
Console.WriteLine($"Multiline v2:{Environment.NewLine}{multiLineV2}");

Single line v2:
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Sed ultricies nulla eu tortor mattis, dictum posuere lacus ornare. Maecenas a massa in ligula finibus luctus eu vitae nibh. Proin imperdiet dapibus mauris quis placerat.


Multiline v2:
[ENTRY NOT FOUND]
