Blog - ari.lt - How to generate a report for songs you listen to using mpv

# Before we start

This post is no longer updated. I turned this whole thing into a baz plugin: https://ari.lt/gh/mpvp-report

A day ago I started collecting data about what I listen to on my playlist. So far it's working out great and it's a lot of fun, so I thought to myself, 'why not share it?' So here you go.

# 1. Set up `mpvp` alias

The `mpvp` alias is what you'll use to collect data about your playlist. You can give it another name, but the code should stay about the same.

Basically, add this to your ~/.bashrc:

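mpvp_collect() {
    # Create the log file if it doesn't exist yet
    [ ! -f "$HOME/.mpvp" ] && : >"$HOME/.mpvp"

    # Give mpv a moment to create the IPC socket
    sleep 2

    while true; do
        sleep 5

        # Ask mpv which file is currently playing
        x="$(echo '{ "command": ["get_property", "path"] }' | socat - /tmp/mpvipc)"

        # An empty reply means mpv is gone, so stop collecting
        [ ! "$x" ] && break

        # Only log when the reply differs from the last logged line
        if [ "$x" != "$(tail -n 1 "$HOME/.mpvp")" ]; then
            # Check again after a few seconds so quickly skipped songs aren't counted
            sleep 4

            y="$(echo '{ "command": ["get_property", "path"] }' | socat - /tmp/mpvipc)"
            [ "$x" = "$y" ] && echo "$x" >>"$HOME/.mpvp"
        fi
    done
}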

alias mpvp='mpvp_collect & mpv --shuffle --loop-playlist --input-ipc-server=/tmp/mpvipc'

When you use the mpvp alias, it starts the data collector in the background and mpv's IPC socket becomes accessible through /tmp/mpvipc. The collector appends everything to ~/.mpvp, so listen to some music and ignore it for a bit. Keep in mind that this code is bad because I wrote it fast and I'm too lazy to improve it. You also need to install socat for it to work.
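If you'd rather not depend on socat, the same query can probably be made from Python over the Unix socket directly. This is only a sketch: `build_request`, `parse_reply`, and `query_path` are names made up for this example, it assumes mpv was started with `--input-ipc-server=/tmp/mpvipc` like in the alias, and it reads the reply with a single `recv`, which is good enough for a short path but not bulletproof.

```python
import json
import socket
from typing import Optional


def build_request(prop: str) -> bytes:
    """Encode a get_property command as one newline-terminated JSON line."""
    return (json.dumps({"command": ["get_property", prop]}) + "\n").encode()


def parse_reply(raw: bytes) -> Optional[str]:
    """Return the "data" field of the first successful reply, if there is one."""
    for line in raw.decode(errors="replace").splitlines():
        try:
            msg = json.loads(line)
        except ValueError:
            continue  # not (complete) JSON, skip it
        # mpv may also push event messages on the socket; those have no "error" key
        if isinstance(msg, dict) and msg.get("error") == "success":
            return msg.get("data")
    return None


def query_path(sock_path: str = "/tmp/mpvipc") -> Optional[str]:
    """Ask a running mpv for the current file path; None if it can't be reached."""
    try:
        with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as s:
            s.settimeout(2)
            s.connect(sock_path)
            s.sendall(build_request("path"))
            return parse_reply(s.recv(65536))
    except OSError:
        return None
```

Calling `query_path()` while mpv is running should give you the same path that ends up in the `"data"` field of each ~/.mpvp line, and `None` once mpv has exited.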

# 2. Generate data report

At this point you can do anything you want with your data, but I made a simple generator for it.

It relies on the data I have and on how my playlist is structured. Here's an example entry:

{"data":"playlist/girl in red - i'll die anyway. [8MMa35B3HT8].mp3","request_id":0,"error":"success"}

There's a YouTube video ID in the file name (the part in square brackets), so the generator adds YouTube links by default. Your files might not have one. You can still pretty much use the generator, the links just won't work.
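To make the file name convention concrete, here's a small sketch that pulls the title and the ID out of an entry like the one above. It uses a regular expression instead of the fixed slicing my script does, and it assumes the ID is 11 characters in square brackets right before the extension. `ID_RE` and `split_entry` are just names for this example.

```python
import json
import os
import re
from typing import Optional, Tuple

# Assumption: "<title> [<11-character ID>]" once the extension is removed
ID_RE = re.compile(r"^(?P<title>.*) \[(?P<id>[A-Za-z0-9_-]{11})\]$")


def split_entry(line: str) -> Optional[Tuple[str, str]]:
    """Return (title, id) for one line of ~/.mpvp, or None if it doesn't match."""
    path = json.loads(line)["data"]
    stem = os.path.splitext(os.path.basename(path))[0]
    m = ID_RE.match(stem)
    return (m["title"], m["id"]) if m else None


entry = '{"data":"playlist/girl in red - i\'ll die anyway. [8MMa35B3HT8].mp3","request_id":0,"error":"success"}'
print(split_entry(entry))
```

Files without a bracketed ID come back as `None` here, so you could skip the link for those instead of generating a broken one.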

# 2.1 The script

I wrote a Python script as my generator:

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""MPV playlist song reporter"""

import os
import sys
from html import escape as html_escape
from typing import Any, Dict, List, Tuple
from warnings import filterwarnings as filter_warnings

import ujson  # type: ignore
from css_html_js_minify import html_minify  # type: ignore

SONG_TO_ARTIST: Dict[str, str] = {
    "1985": "bo burnham",
    "apocalypse": "cigarettes after Sex",
    "astronomy": "conan gray",
    "brooklyn baby": "lana del rey",
    "come home to me": "crawlers",
    "daddy issues": "the neighbourhood",
    "feel better": "penelope scott",
    "hornylovesickmess": "girl in red",
    "i wanna be your girlfriend": "girl in red",
    "k.": "cigarettes after Sex",
    "lookalike": "conan gray",
    "lotta true crime": "penelope scott",
    "my man's a hexagon (music video)": "münecat",
    "rät": "penelope scott",
    "sappho": "bushies",
    "serial killer - lana del rey lyrics": "lana del rey",
    "sugar, we're goin down but it's creepier": "kade",
    "sweater weather": "the neighbourhood",
    "talia ⧸⧸ girl in red cover": "girl in red",
    "tv": "bushies",
    "unionize - münecat (music video)": "münecat",
    "watch you sleep.": "girl in red",
    "you used me for my love_girl in red": "girl in red",
}


class UnknownMusicArtistError(Exception):
    """Raised when there is an unknown music artist"""


def sort_dict(d: Dict[str, int]) -> Dict[str, int]:
    return {k: v for k, v in sorted(d.items(), key=lambda item: item[1], reverse=True)}


def fsplit_dels(s: str, *dels: str) -> str:
    for delim in dels:
        s = s.split(delim, maxsplit=1)[0]

    return s.strip()


def get_artist_from_song(song: str) -> str:
    song = song.lower()
    delims: Tuple[str, ...] = (
        "–",
        "-",
        ",",
        "feat.",
        ".",
        "&",
    )

    if song not in SONG_TO_ARTIST and any(d in song for d in delims):
        return fsplit_dels(
            song,
            *delims,
        )
    else:
        if song in SONG_TO_ARTIST:
            return SONG_TO_ARTIST[song].lower()

        raise UnknownMusicArtistError(f"No handled artist for song: {song!r}")


def get_played(data: List[Tuple[str, str]]) -> Dict[str, int]:
    played: Dict[str, int] = {}

    for song, _ in data:
        if song not in played:
            played[song] = 0

        played[song] += 1

    return sort_dict(played)


def get_yt_urls_from_data(data: List[Tuple[str, str]]) -> Dict[str, str]:
    return {song: f"https://ari.lt/yt/watch?v={yt_id}" for song, yt_id in data}


def get_artists_from_played(played: Dict[str, int]) -> Dict[str, List[int]]:
    artists: Dict[str, List[int]] = {}

    for song in played:
        artist = get_artist_from_song(song)

        if artist not in artists:
            artists[artist] = [0, 0]

        artists[artist][0] += 1
        artists[artist][1] += played[song]

    return {
        k: v
        for k, v in sorted(artists.items(), key=lambda item: sum(item[1]), reverse=True)
    }


def parse_song(song: str) -> Tuple[str, str]:
    basename: str = os.path.splitext(os.path.basename(song))[0]
    return basename[:-14], basename[-12:-1]


def parse_data(data: List[Tuple[str, str]]) -> Dict[str, Any]:
    played: Dict[str, int] = get_played(data)

    return {
        "total": len(data),
        "played": played,
        "artists": get_artists_from_played(played),
        "yt-urls": get_yt_urls_from_data(data),
    }


def generate_html_report(data: Dict[str, Any]) -> str:
    styles: str = """
@import url("https://cdn.jsdelivr.net/npm/hack-font@3/build/web/hack.min.css");

:root {
    color-scheme: dark;

    --clr-bg: #262220;
    --clr-fg: #f9f6e8;

    --clr-code-bg: #1f1b1a;
    --clr-code-fg: #f0f3e6;
    --clr-code-bg-dark: #181414;

    --scrollbar-height: 6px; /* TODO: Firefox */
}

*,
*::before,
*::after {
    background-color: var(--clr-bg);
    color: var(--clr-fg);
    font-family: Hack, hack, monospace;

    scrollbar-width: none;
    -ms-overflow-style: none;

    scrollbar-color: var(--clr-code-bg-dark) transparent;

    -webkit-box-sizing: border-box;
    box-sizing: border-box;

    word-wrap: break-word;

    scroll-behavior: smooth;
}

::-webkit-scrollbar,
::-webkit-scrollbar-thumb {
    height: var(--scrollbar-height);
}

::-webkit-scrollbar {
    background-color: transparent;
}

::-webkit-scrollbar-thumb {
    background-color: var(--clr-code-bg-dark);
}

html::-webkit-scrollbar,
body::-webkit-scrollbar {
    display: none !important;
}

body {
    margin: auto;
    padding: 2rem;
    max-width: 1100px;
    min-height: 100vh;
    text-rendering: optimizeSpeed;
}

h1 {
    text-align: center;
    margin: 1em;
    font-size: 2em;
}

li {
    margin: 0.5em;
}

a {
    text-decoration: none;
    text-shadow: 0px 0px 4px white;
}

pre,
pre * {
    background-color: var(--clr-code-bg);
}

pre,
pre *,
code {
    color: var(--clr-code-fg);
}

pre,
pre code {
    overflow-x: auto !important;

    scrollbar-width: initial;
    -ms-overflow-style: initial;
}

pre {
    padding: 1em;
    border-radius: 4px;
}

code:not(pre code) {
    background-color: var(--clr-code-bg);
    border-radius: 2px;
    padding: 0.2em;
}

@media (prefers-reduced-motion: reduce) {
    *,
    *::before,
    *::after {
        -webkit-animation-duration: 0.01ms !important;
        animation-duration: 0.01ms !important;

        -webkit-animation-iteration-count: 1 !important;
        animation-iteration-count: 1 !important;

        -webkit-transition-duration: 0.01ms !important;
        -o-transition-duration: 0.01ms !important;
        transition-duration: 0.01ms !important;

        scroll-behavior: auto !important;
    }
}

@media (prefers-contrast: more) {
    :root {
        --clr-bg: black;
        --clr-fg: white;

        --clr-code-bg: #181818;
        --clr-code-fg: whitesmoke;

        --scrollbar-height: 12px; /* TODO: Firefox */
    }

    html::-webkit-scrollbar {
        display: initial !important;
    }

    *,
    *::before,
    *::after {
        scrollbar-width: initial !important;
        -ms-overflow-style: initial !important;
    }

    a {
        text-shadow: none !important;

        -webkit-text-decoration: underline dotted !important;
        text-decoration: underline dotted !important;
    }
}
"""

    songs = artists = ""

    for song, times in data["played"].items():
        songs += f"<li><a href=\"{data['yt-urls'][song]}\">{html_escape(song)}</a> (played <code>{times}</code> time{'s' if times > 1 else ''})</li>"

    for artist, songn in data["artists"].items():
        rps: str = f" (<code>{songn[1]}</code> repeats)"
        artists += f"<li>{html_escape(artist)} (<code>{songn[0]}</code> song{'s' if songn[0] > 1 else ''} \
played{rps if songn[1] > 1 else ''})</li>"

    return html_minify(
        f"""<!DOCTYPE html>
<html lang="en">
    <head>
        <meta charset="UTF-8">
        <meta http-equiv="X-UA-Compatible" content="IE=edge">
        <meta name="viewport" content="width=device-width, initial-scale=1.0">
        <title>HTML mpv song report</title>

        <meta name="description" content="What do you listen to" />
        <meta
            name="keywords"
            content="sort, report, music, music report, listen, song"
        />
        <meta
            name="robots"
            content="follow, index, max-snippet:-1, max-video-preview:-1, max-image-preview:large"
        />
        <meta property="og:type" content="website" />
        <meta name="color-scheme" content="dark" />

        <style>{styles}</style>
    </head>

    <body>
        <main>
            <h1>What are you listening to?</h1>

            <hr />

            <h2>Stats</h2>

            <ul>
                <li>Songs played: <code>{data['total']}</code></li>
                <li>Unique songs played: <code>{len(data['played'])}</code></li>
                <li>Artists: <code>{len(data['artists'])}</code></li>
            </ul>

            <h2>Top stats</h2>

            <ul>
                <li>Top artist: <code>{tuple(data['artists'].keys())[0]}</code> with <code>{tuple(data['artists'].values())[0][0]}</code> songs played and \
<code>{tuple(data['artists'].values())[0][1]}</code> repeats</li>
                <li>Top song: <code>{tuple(data['played'].keys())[0]}</code> by <code>{get_artist_from_song(tuple(data['played'].keys())[0])}</code> \
with <code>{tuple(data['played'].values())[0]}</code> plays</li>
            </ul>

            <h2>Songs</h2>

            <details>
                <summary>Expand for the list of songs</summary>
                <ul>{songs}</ul>
            </details>

            <h2>Artists</h2>

            <details>
                <summary>Expand for the list of artists</summary>
                <ul>{artists}</ul>
            </details>

            <h2>Raw JSON data</h2>

            <details>
                <summary>Expand for the raw data</summary>
                <pre><code>{ujson.dumps(data, indent=4)}</code></pre>
            </details>
        </main>
    </body>
</html>"""
    )


def main() -> int:
    """Entry/main function"""

    data: List[Tuple[str, str]] = []

    with open(os.path.expanduser("~/.mpvp"), "r") as mpv_data:
        for line in mpv_data:
            if '"data"' not in line:
                continue

            data.append(parse_song(ujson.loads(line)["data"]))

    with open("index.html", "w") as h:
        h.write(generate_html_report(parse_data(data)))

    return 0


if __name__ == "__main__":
    assert main.__annotations__.get("return") is int, "main() should return an integer"

    filter_warnings("error", category=Warning)
    sys.exit(main())

This is pretty simple, very stupid, and not fool-proof, but eh. The generator should work out of the box if your song names follow the `artist name - song` format. If they don't, add a lowercase entry to SONG_TO_ARTIST. For example, if your song was named `naMe - Artist`, you'd have to add this entry:

"name - artist": "artist",

These settings that you see in my script are for my playlist
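Roughly, the lookup works like this: the override table wins, and only when there's no entry does the title get cut at the delimiters. Here's a simplified sketch of that idea, not the exact code from the script. `OVERRIDES` and `guess_artist` are names made up for this example, and it raises a plain `ValueError` where the script has its own exception.

```python
from typing import Dict, Tuple

# Hypothetical override table; keys must be lowercase
OVERRIDES: Dict[str, str] = {"name - artist": "artist"}
DELIMS: Tuple[str, ...] = ("–", "-", ",", "feat.", ".", "&")


def guess_artist(title: str) -> str:
    """Overrides win; otherwise keep whatever comes before every delimiter."""
    title = title.lower()
    if title in OVERRIDES:
        return OVERRIDES[title]
    if not any(d in title for d in DELIMS):
        raise ValueError(f"no artist known for {title!r}")
    for d in DELIMS:
        # Cut at each delimiter in turn, keeping only the left part
        title = title.split(d, maxsplit=1)[0]
    return title.strip()


print(guess_artist("naMe - Artist"))
print(guess_artist("girl in red - i'll die anyway."))
```

The first call hits the override and prints `artist`, the second falls back to splitting and prints `girl in red`. A title with no delimiter and no override (like `1985` without its table entry) raises an error, which is the cue to add an entry.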

# 2.2 Dependencies

Here are the Python dependencies you need:

css-html-js-minify
ujson

Install them using:

python3 -m pip install --user css-html-js-minify ujson

# 2.3 The data report

Once you have enough data to make a report from, run the script:

python3 main.py

Or whatever you named it. It generates an index.html file that includes all of your report data. You can also restyle it by editing the `styles` variable.
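If you just want a quick look at the numbers before opening the HTML, play counts can be tallied in a few lines. A minimal sketch, assuming the same one-JSON-reply-per-line format in ~/.mpvp. `top_songs` is a hypothetical helper, not part of my script, and it counts by file name without splitting off the ID.

```python
import json
import os
from collections import Counter
from typing import Iterable, List, Tuple


def top_songs(lines: Iterable[str], n: int = 5) -> List[Tuple[str, int]]:
    """Count how often each file name shows up and return the n most played."""
    counts: Counter = Counter()
    for line in lines:
        if '"data"' not in line:
            continue  # skip anything that isn't a reply carrying a path
        path = json.loads(line)["data"]
        counts[os.path.basename(path)] += 1
    return counts.most_common(n)


if __name__ == "__main__":
    log = os.path.expanduser("~/.mpvp")
    if os.path.exists(log):
        with open(log, "r") as f:
            for name, plays in top_songs(f):
                print(f"{plays:>4}  {name}")
```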

# 3. Finishing

That's all, enjoy your statistics. As for me, I shall go collect more data. I already have 18KB of it!

Plus, I'll admit it, most of this code is garbage, complete dog shit. I just wanted to make it work and I did. It's readable enough for a messy script I'm not even releasing as anything legit.