How to generate a report for songs you listen to using mpv

# Before we start

This post is no longer updated. I turned the whole thing into a baz plugin: https://ari.lt/gh/mpvp-report

A day ago I started collecting data about what I listen to on my playlist. So far it's working out great and it's a lot of fun, so I thought to myself, 'why not share it?' So here you go.

# 1. Set up mpvp alias

The mpvp alias is what you'll use to collect data about your playlist. You can give it another name, but the code should stay about the same.

Basically, add this to your ~/.bashrc:

mpvp_collect() {
    [ ! -f "$HOME/.mpvp" ] && : >"$HOME/.mpvp"

    sleep 2

    while true; do
        sleep 5

        x="$(echo '{ "command": ["get_property", "path"] }' | socat - /tmp/mpvipc)"

        [ ! "$x" ] && break

        if [ "$x" ] && [ "$x" != "$(tail -n 1 "$HOME/.mpvp")" ]; then
            sleep 4

            y="$(echo '{ "command": ["get_property", "path"] }' | socat - /tmp/mpvipc)"
            [ "$x" = "$y" ] && echo "$x" >>"$HOME/.mpvp"
        fi
    done
}

alias mpvp='mpvp_collect & mpv --shuffle --loop-playlist --input-ipc-server=/tmp/mpvipc'

When you use the mpvp alias, it starts the data collector in the background. mpv's IPC socket is accessible through /tmp/mpvipc, and the collector logs the songs you play to ~/.mpvp. You need to install socat for this to work. Listen to some music and ignore it for a bit. Also, keep in mind that this code is bad: I made it fast and I'm too lazy to improve it.
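For anyone who would rather not depend on socat, here is a rough Python sketch of the same idea. It is not what I actually run. The names `extract_path`, `query_path` and `should_record` are made up for illustration, and unlike the shell version (which stores the raw JSON reply) it compares bare paths. It assumes mpv answers a `get_property` request with one JSON line that has an `error` field, like the example entry in section 2.

```python
import json
import socket
from typing import Optional

SOCKET_PATH = "/tmp/mpvipc"  # same socket path the alias passes to mpv


def extract_path(reply: str) -> Optional[str]:
    """Return the 'data' field of a successful mpv reply, else None."""
    try:
        message = json.loads(reply)
    except ValueError:
        return None
    if not isinstance(message, dict) or message.get("error") != "success":
        return None
    data = message.get("data")
    return data if isinstance(data, str) else None


def query_path(socket_path: str = SOCKET_PATH) -> Optional[str]:
    """Ask mpv which file is playing; None if mpv cannot be reached."""
    request = json.dumps({"command": ["get_property", "path"]}) + "\n"
    try:
        with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
            sock.settimeout(2)
            sock.connect(socket_path)
            sock.sendall(request.encode("utf-8"))
            with sock.makefile("r", encoding="utf-8") as stream:
                for line in stream:
                    # Assumption: only command replies carry an "error" key,
                    # so any other line (an event) is skipped.
                    if '"error"' in line:
                        return extract_path(line)
    except OSError:
        return None
    return None


def should_record(
    first: Optional[str], second: Optional[str], last: Optional[str]
) -> bool:
    """Mirror the shell check: both polls agree and it is not the last entry."""
    return bool(first) and first == second and first != last
```

The double poll is the same trick as the `sleep 4` in the shell function: a song only counts if it is still playing a few seconds later, so quick skips don't end up in the log.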

# 2. Generate data report

At this point you can do anything you want with your data, but I made a simple report generator for it.

It relies on the data I have and on my playlist structure. Here's an example entry:

{"data":"playlist/girl in red - i'll die anyway. [8MMa35B3HT8].mp3","request_id":0,"error":"success"}

There's a YouTube ID in the filename, so the generator adds YouTube links by default. Your files might not have one; you can still use the generator, the links just won't work.
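If you want to poke at the log before running the full generator, here is a small sketch that splits an entry like the one above into a title and an ID and counts plays. `NAME_RE`, `split_entry` and `count_plays` are names I made up for this example, and the 11-character ID pattern is an assumption based on that single entry. The real script slices fixed offsets instead, so treat this as a more forgiving variant rather than the same code.

```python
import json
import os
import re
from collections import Counter
from typing import Iterable, Optional, Tuple

# Hypothetical pattern: "<title> [<11-character YouTube ID>]", ID optional.
NAME_RE = re.compile(r"^(?P<title>.*?)(?: \[(?P<yt_id>[A-Za-z0-9_-]{11})\])?$")


def split_entry(line: str) -> Optional[Tuple[str, Optional[str]]]:
    """Turn one ~/.mpvp line into (title, YouTube ID or None)."""
    try:
        path = json.loads(line)["data"]
        # Drop the directory and the file extension.
        stem = os.path.splitext(os.path.basename(path))[0]
    except (ValueError, KeyError, TypeError):
        return None  # not a usable reply line

    match = NAME_RE.match(stem)
    if match is None:
        return None
    return match.group("title"), match.group("yt_id")


def count_plays(lines: Iterable[str]) -> Counter:
    """Count how often each title shows up, skipping unusable lines."""
    plays: Counter = Counter()
    for line in lines:
        entry = split_entry(line)
        if entry is not None:
            plays[entry[0]] += 1
    return plays
```

Something like `count_plays(open(os.path.expanduser("~/.mpvp")))` followed by `.most_common(10)` would give a quick top ten without any HTML.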

# 2.1 The script

I wrote a Python script as my generator:

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""MPV playlist song reporter"""

import os
import sys
from html import escape as html_escape
from typing import Any, Dict, List, Tuple
from warnings import filterwarnings as filter_warnings

import ujson  # type: ignore
from css_html_js_minify import html_minify  # type: ignore

SONG_TO_ARTIST: Dict[str, str] = {
    "1985": "bo burnham",
    "apocalypse": "cigarettes after Sex",
    "astronomy": "conan gray",
    "brooklyn baby": "lana del rey",
    "come home to me": "crawlers",
    "daddy issues": "the neighbourhood",
    "feel better": "penelope scott",
    "hornylovesickmess": "girl in red",
    "i wanna be your girlfriend": "girl in red",
    "k.": "cigarettes after Sex",
    "lookalike": "conan gray",
    "lotta true crime": "penelope scott",
    "my man's a hexagon (music video)": "münecat",
    "rät": "penelope scott",
    "sappho": "bushies",
    "serial killer - lana del rey lyrics": "lana del rey",
    "sugar, we're goin down but it's creepier": "kade",
    "sweater weather": "the neighbourhood",
    "talia ⧸⧸ girl in red cover": "girl in red",
    "tv": "bushies",
    "unionize - münecat (music video)": "münecat",
    "watch you sleep.": "girl in red",
    "you used me for my love_girl in red": "girl in red",
}


class UnknownMusicArtistError(Exception):
    """Raised when there is an unknown music artist"""


def sort_dict(d: Dict[str, int]) -> Dict[str, int]:
    return {k: v for k, v in sorted(d.items(), key=lambda item: item[1], reverse=True)}


def fsplit_dels(s: str, *dels: str) -> str:
    for delim in dels:
        s = s.split(delim, maxsplit=1)[0]

    return s.strip()


def get_artist_from_song(song: str) -> str:
    song = song.lower()
    delims: Tuple[str, ...] = (
        "–",
        "-",
        ",",
        "feat.",
        ".",
        "&",
    )

    if song not in SONG_TO_ARTIST and any(d in song for d in delims):
        return fsplit_dels(
            song,
            *delims,
        )
    else:
        if song in SONG_TO_ARTIST:
            return SONG_TO_ARTIST[song].lower()

        raise UnknownMusicArtistError(f"No handled artist for song: {song!r}")


def get_played(data: List[Tuple[str, str]]) -> Dict[str, int]:
    played: Dict[str, int] = {}

    for song, _ in data:
        if song not in played:
            played[song] = 0

        played[song] += 1

    return sort_dict(played)


def get_yt_urls_from_data(data: List[Tuple[str, str]]) -> Dict[str, str]:
    return {song: f"https://ari.lt/yt/watch?v={yt_id}" for song, yt_id in data}


def get_artists_from_played(played: Dict[str, int]) -> Dict[str, List[int]]:
    artists: Dict[str, List[int]] = {}

    for song in played:
        artist = get_artist_from_song(song)

        if artist not in artists:
            artists[artist] = [0, 0]

        artists[artist][0] += 1
        artists[artist][1] += played[song]

    return {
        k: v
        for k, v in sorted(artists.items(), key=lambda item: sum(item[1]), reverse=True)
    }


def parse_song(song: str) -> Tuple[str, str]:
    basename: str = os.path.splitext(os.path.basename(song))[0]
    return basename[:-14], basename[-12:-1]


def parse_data(data: List[Tuple[str, str]]) -> Dict[str, Any]:
    played: Dict[str, int] = get_played(data)

    return {
        "total": len(data),
        "played": played,
        "artists": get_artists_from_played(played),
        "yt-urls": get_yt_urls_from_data(data),
    }


def generate_html_report(data: Dict[str, Any]) -> str:
    styles: str = """
@import url("https://cdn.jsdelivr.net/npm/hack-font@3/build/web/hack.min.css");

:root {
    color-scheme: dark;

    --clr-bg: #262220;
    --clr-fg: #f9f6e8;

    --clr-code-bg: #1f1b1a;
    --clr-code-fg: #f0f3e6;
    --clr-code-bg-dark: #181414;

    --scrollbar-height: 6px; /* TODO: Firefox */
}

*,
*::before,
*::after {
    background-color: var(--clr-bg);
    color: var(--clr-fg);
    font-family: Hack, hack, monospace;

    scrollbar-width: none;
    -ms-overflow-style: none;

    scrollbar-color: var(--clr-code-bg-dark) transparent;

    -webkit-box-sizing: border-box;
    box-sizing: border-box;

    word-wrap: break-word;

    scroll-behavior: smooth;
}

::-webkit-scrollbar,
::-webkit-scrollbar-thumb {
    height: var(--scrollbar-height);
}

::-webkit-scrollbar {
    background-color: transparent;
}

::-webkit-scrollbar-thumb {
    background-color: var(--clr-code-bg-dark);
}

html::-webkit-scrollbar,
body::-webkit-scrollbar {
    display: none !important;
}

body {
    margin: auto;
    padding: 2rem;
    max-width: 1100px;
    min-height: 100vh;
    text-rendering: optimizeSpeed;
}

h1 {
    text-align: center;
    margin: 1em;
    font-size: 2em;
}

li {
    margin: 0.5em;
}

a {
    text-decoration: none;
    text-shadow: 0px 0px 4px white;
}

pre,
pre * {
    background-color: var(--clr-code-bg);
}

pre,
pre *,
code {
    color: var(--clr-code-fg);
}

pre,
pre code {
    overflow-x: auto !important;

    scrollbar-width: initial;
    -ms-overflow-style: initial;
}

pre {
    padding: 1em;
    border-radius: 4px;
}

code:not(pre code) {
    background-color: var(--clr-code-bg);
    border-radius: 2px;
    padding: 0.2em;
}

@media (prefers-reduced-motion: reduce) {
    *,
    *::before,
    *::after {
        -webkit-animation-duration: 0.01ms !important;
        animation-duration: 0.01ms !important;

        -webkit-animation-iteration-count: 1 !important;
        animation-iteration-count: 1 !important;

        -webkit-transition-duration: 0.01ms !important;
        -o-transition-duration: 0.01ms !important;
        transition-duration: 0.01ms !important;

        scroll-behavior: auto !important;
    }
}

@media (prefers-contrast: more) {
    :root {
        --clr-bg: black;
        --clr-fg: white;

        --clr-code-bg: #181818;
        --clr-code-fg: whitesmoke;

        --scrollbar-height: 12px; /* TODO: Firefox */
    }

    html::-webkit-scrollbar {
        display: initial !important;
    }

    *,
    *::before,
    *::after {
        scrollbar-width: initial !important;
        -ms-overflow-style: initial !important;
    }

    a {
        text-shadow: none !important;

        -webkit-text-decoration: underline dotted !important;
        text-decoration: underline dotted !important;
    }
}
"""

    songs = artists = ""

    for song, times in data["played"].items():
        songs += f"<li><a href=\"{data['yt-urls'][song]}\">{html_escape(song)}</a> (played <code>{times}</code> time{'s' if times > 1 else ''})</li>"

    for artist, songn in data["artists"].items():
        rps: str = f" (<code>{songn[1]}</code> repeats)"
        artists += f"<li>{html_escape(artist)} (<code>{songn[0]}</code> song{'s' if songn[0] > 1 else ''} \
played{rps if songn[1] > 1 else ''})</li>"

    return html_minify(
        f"""<!DOCTYPE html>
<html lang="en">
    <head>
        <meta charset="UTF-8">
        <meta http-equiv="X-UA-Compatible" content="IE=edge">
        <meta name="viewport" content="width=device-width, initial-scale=1.0">
        <title>HTML mpv song report</title>

        <meta name="description" content="What do you listen to" />
        <meta
            name="keywords"
            content="sort, report, music, music report, listen, song"
        />
        <meta
            name="robots"
            content="follow, index, max-snippet:-1, max-video-preview:-1, max-image-preview:large"
        />
        <meta property="og:type" content="website" />
        <meta name="color-scheme" content="dark" />

        <style>{styles}</style>
    </head>

    <body>
        <main>
            <h1>What are you listening to?</h1>

            <hr />

            <h2>Stats</h2>

            <ul>
                <li>Songs played: <code>{data['total']}</code></li>
                <li>Unique songs played: <code>{len(data['played'])}</code></li>
                <li>Artists: <code>{len(data['artists'])}</code></li>
            </ul>

            <h2>Top stats</h2>

            <ul>
                <li>Top artist: <code>{tuple(data['artists'].keys())[0]}</code> with <code>{tuple(data['artists'].values())[0][0]}</code> songs played and \
<code>{tuple(data['artists'].values())[0][1]}</code> repeats</li>
                <li>Top song: <code>{tuple(data['played'].keys())[0]}</code> by <code>{get_artist_from_song(tuple(data['played'].keys())[0])}</code> \
with <code>{tuple(data['played'].values())[0]}</code> plays</li>
            </ul>

            <h2>Songs</h2>

            <details>
                <summary>Expand for the list of songs</summary>
                <ul>{songs}</ul>
            </details>

            <h2>Artists</h2>

            <details>
                <summary>Expand for the list of artists</summary>
                <ul>{artists}</ul>
            </details>

            <h2>Raw JSON data</h2>

            <details>
                <summary>Expand for the raw data</summary>
                <pre><code>{html_escape(ujson.dumps(data, indent=4))}</code></pre>
            </details>
        </main>
    </body>
</html>"""
    )


def main() -> int:
    """Entry/main function"""

    data: List[Tuple[str, str]] = []

    with open(os.path.expanduser("~/.mpvp"), "r") as mpv_data:
        for line in mpv_data:
            if '"data"' not in line:
                continue

            data.append(parse_song(ujson.loads(line)["data"]))

    with open("index.html", "w") as h:
        h.write(generate_html_report(parse_data(data)))

    return 0


if __name__ == "__main__":
    assert main.__annotations__.get("return") is int, "main() should return an integer"

    filter_warnings("error", category=Warning)
    sys.exit(main())

This is pretty simple, fairly stupid and not foolproof, but eh. The generator should work out of the box when the song name format is artist name - song. If it's not, add a lowercase entry to SONG_TO_ARTIST. For example, if your song was named naMe - Artist, you would have to add this entry:

"name - artist": "artist",

The entries you see in my script are for my playlist.
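To see why the override is needed, here is a trimmed-down sketch of the lookup. `guess_artist` and `DELIMS` are illustrative names, and it returns None where the real script raises UnknownMusicArtistError, but the delimiter list and the order of checks are the same as in the generator.

```python
from typing import Dict, Optional

DELIMS = ("–", "-", ",", "feat.", ".", "&")  # same delimiters as the script


def guess_artist(song: str, overrides: Dict[str, str]) -> Optional[str]:
    """Overrides win; otherwise keep the text before the first delimiter."""
    key = song.lower()

    if key in overrides:
        return overrides[key].lower()

    if not any(delim in key for delim in DELIMS):
        return None  # the real script raises UnknownMusicArtistError here

    # Cut at each delimiter in turn, always keeping the left-hand part.
    for delim in DELIMS:
        key = key.split(delim, maxsplit=1)[0]

    return key.strip()
```

Without an override, `guess_artist("naMe - Artist", {})` takes whatever comes first and returns `"name"`. With `{"name - artist": "artist"}` it returns `"artist"`.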

# 2.2 Dependencies

Here are the Python dependencies you need:

css-html-js-minify
ujson

Install them using:

python3 -m pip install --user css-html-js-minify ujson

# 2.3 The data report

Once you have enough data to make a report from, just run the script:

python3 main.py

Or whatever you named it. It generates an index.html file that includes all of your report data. You can also restyle it by editing the styles variable.

# 3. Finishing

That's all, enjoy your statistics. As for me, I'll go collect more data. I already have 18KB of it!

Plus, I'll admit it: most of this code is garbage, complete dog shit. I just wanted to make it work, and I did. It's readable enough for a messy script that I'm not even releasing as anything legit.