
This module contains classes to describe the content of an audio-segment.


class audiomate.annotations.Label(value, start=0, end=inf, meta=None)[source]

Represents a label that describes some part of an utterance.

  • value (str) – The text of the label.
  • start (float) – Start of the label within the utterance in seconds. (default: 0)
  • end (float) – End of the label within the utterance in seconds. (default: inf, which denotes the end of the utterance)
  • meta (dict) – A dictionary containing additional information for the label.

label_list (LabelList) – The label-list this label belongs to.

do_overlap(other_label, adjacent=True)[source]

Determine whether other_label overlaps with this label. If adjacent==True, adjacent labels are also considered as overlapping.

  • other_label (Label) – Another label.
  • adjacent (bool) – If True, adjacent labels are considered as overlapping.

Returns:True if the two labels overlap, False otherwise.
Return type:bool
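The effect of the adjacent flag can be illustrated with a small standalone sketch. It works on plain (start, end) pairs rather than Label objects, and the function name intervals_overlap is hypothetical. It only shows the comparison involved and is not audiomate's actual implementation.

```python
def intervals_overlap(a, b, adjacent=True):
    """Check whether two (start, end) intervals overlap.

    With adjacent=True, intervals that merely touch count as overlapping.
    """
    a_start, a_end = a
    b_start, b_end = b
    if adjacent:
        # Touching end points are enough.
        return a_start <= b_end and b_start <= a_end
    # Require a shared stretch of non-zero length.
    return a_start < b_end and b_start < a_end


print(intervals_overlap((0.0, 2.0), (2.0, 4.0)))                  # True
print(intervals_overlap((0.0, 2.0), (2.0, 4.0), adjacent=False))  # False
```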



Return the duration of the label in seconds.


Return the absolute end of the label in seconds relative to the signal. If the label isn’t linked to any utterance via a label-list, it is assumed that self.end is relative to the start of the signal, hence self.end == self.end_abs.


Return the length of the label (Number of characters).


Return the duration of the overlapping part between this label and other_label.

Parameters:other_label (Label) – Another label to check.
Returns:The duration of overlap in seconds.
Return type:float


>>> label_a = Label('a', 3.4, 5.6)
>>> label_b = Label('b', 4.8, 6.2)
>>> label_a.overlap_duration(label_b)
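As a rough illustration of how such an overlap duration can be computed, the sketch below uses plain (start, end) pairs. The helper is hypothetical and not taken from audiomate. The result is rounded for display because floating-point subtraction can introduce small errors.

```python
def overlap_duration(a, b):
    """Return the length in seconds of the part shared by two (start, end) intervals."""
    a_start, a_end = a
    b_start, b_end = b
    shared = min(a_end, b_end) - max(a_start, b_start)
    # A negative value means the intervals are disjoint.
    return max(0.0, shared)


# Same intervals as label_a and label_b above; rounded to hide float noise.
print(round(overlap_duration((3.4, 5.6), (4.8, 6.2)), 3))  # 0.8
```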

Read the samples of the utterance.

Parameters:sr (int) – If None uses the sampling rate given by the track, otherwise resamples to the given sampling rate.
Returns:A numpy array containing the samples as a floating point (numpy.float32) time series.
Return type:np.ndarray

Return the absolute start of the label in seconds relative to the signal. If the label isn’t linked to any utterance via a label-list, it is assumed that self.start is relative to the start of the signal, hence self.start == self.start_abs.

tokenized(delimiter=' ')[source]

Return a list with tokens from the value of the label. Tokens are extracted by splitting the string using delimiter and then trimming any whitespace before and after the resulting strings.

Parameters:delimiter (str) – The delimiter used to split into tokens. (default: space)
Returns:A list of tokens in the order they occur in the label.
Return type:list


>>> label = Label('as is oh')
>>> label.tokenized()
['as', 'is', 'oh']

Using a different delimiter (whitespace is trimmed anyway):

>>> label = Label('oh hi, as, is  ')
>>> label.tokenized(delimiter=',')
['oh hi', 'as', 'is']
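
The split-then-trim behaviour described above can be sketched in a few lines of plain Python. The function name tokenize is hypothetical, and dropping pieces that are empty after trimming is an assumption of this sketch rather than documented behaviour.

```python
def tokenize(value, delimiter=' '):
    """Split value on delimiter and strip whitespace around each piece."""
    tokens = [piece.strip() for piece in value.split(delimiter)]
    # Assumption: pieces that are empty after stripping are dropped.
    return [token for token in tokens if token]


print(tokenize('as is oh'))                        # ['as', 'is', 'oh']
print(tokenize('oh hi, as, is  ', delimiter=','))  # ['oh hi', 'as', 'is']
```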


class audiomate.annotations.LabelList(idx='default', labels=None)[source]

Represents a list of labels which describe an utterance. An utterance can have multiple label-lists.

  • idx (str) – A unique identifier for the label-list within a corpus for one utterance.
  • labels (list) – The list of audiomate.annotations.Label objects.
  • utterance (Utterance) – The utterance this label-list belongs to.
  • label_tree (IntervalTree) – The interval-tree storing the labels.


>>> label_list = LabelList(idx='transcription', labels=[
...     Label('this', 0, 2),
...     Label('is', 2, 4),
...     Label('timmy', 4, 8)
... ])

Add a label to the end of the list.

Parameters:label (Label) – The label to add.

addl(value, start=0.0, end=inf)[source]

Shortcut for add(Label(value, start, end)).

all_tokens(delimiter=' ')[source]

Return a set of all distinct tokens occurring in the label-list.

Parameters:delimiter (str) – The delimiter used to split labels into tokens (see audiomate.annotations.Label.tokenized()).
Returns:A set of distinct tokens.
Return type:set

Apply the given function fn to every label in this label list. fn is a function of one argument that receives the current label which can then be edited in place.

Parameters:fn (func) – Function to apply to every label


>>> ll = LabelList(labels=[
...     Label('a_label', 1.0, 2.0),
...     Label('another_label', 2.0, 3.0)
... ])
>>> def shift_labels(label):
...     label.start += 1.0
...     label.end += 1.0
>>> ll.apply(shift_labels)
>>> ll.labels
[Label(a_label, 2.0, 3.0), Label(another_label, 3.0, 4.0)]

classmethod create_single(value, idx='default')[source]

Create a label-list with a single label containing the given value.


Return the end of the label that ends last (upper bound).

join(delimiter=' ', overlap_threshold=0.1)[source]

Return a string with all labels concatenated together. The labels are ordered by their start. If the overlap between two consecutive labels is greater than overlap_threshold, an exception is raised.

  • delimiter (str) – A string to join two consecutive labels.
  • overlap_threshold (float) – Maximum overlap between two consecutive labels.

Returns:A string with all labels concatenated together.
Return type:str



>>> ll = LabelList(idx='some', labels=[
...     Label('a', start=0, end=4),
...     Label('b', start=3.95, end=6.0),
...     Label('c', start=7.0, end=10.2),
...     Label('d', start=10.3, end=14.0)
... ])
>>> ll.join(' - ')
'a - b - c - d'
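
The ordering and the overlap check can be sketched as follows. This is a simplified stand-in that uses (value, start, end) tuples. The name join_labels is hypothetical, and raising ValueError is an assumption, since the text above only states that an exception is raised.

```python
def join_labels(labels, delimiter=' ', overlap_threshold=0.1):
    """Join (value, start, end) labels in start order.

    Raises ValueError if two consecutive labels overlap by more than
    overlap_threshold seconds.
    """
    ordered = sorted(labels, key=lambda label: label[1])
    for (_, _, prev_end), (_, next_start, _) in zip(ordered, ordered[1:]):
        if prev_end - next_start > overlap_threshold:
            raise ValueError('labels overlap by more than the threshold')
    return delimiter.join(value for value, _, _ in ordered)


labels = [('a', 0, 4), ('b', 3.95, 6.0), ('c', 7.0, 10.2), ('d', 10.3, 14.0)]
print(join_labels(labels, ' - '))  # a - b - c - d
```

The labels 'a' and 'b' overlap by roughly 0.05 seconds, which stays below the default threshold of 0.1, so no exception is raised.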

Return for each label the number of occurrences within the list.

Returns:A dictionary containing for every label-value (key) the number of occurrences (value).
Return type:dict


>>> ll = LabelList(labels=[
...     Label('a', 3.2, 4.5),
...     Label('b', 5.1, 8.9),
...     Label('a', 7.2, 10.5),
...     Label('b', 10.5, 14),
...     Label('a', 15, 18)
... ])
>>> ll.label_count()
{'a': 3, 'b': 2}

Return for each distinct label value the total duration of all occurrences.

Returns:A dictionary containing for every label-value (key) the total duration in seconds (value).
Return type:dict


>>> ll = LabelList(labels=[
...     Label('a', 3, 5),
...     Label('b', 5, 8),
...     Label('a', 8, 10),
...     Label('b', 10, 14),
...     Label('a', 15, 18.5)
... ])
>>> ll.label_total_duration()
{'a': 7.5, 'b': 7.0}
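
A per-value aggregation like this can be sketched with a defaultdict. The function total_durations is a hypothetical helper that works on (value, start, end) tuples. It illustrates the idea and is not the library's code.

```python
from collections import defaultdict


def total_durations(labels):
    """Sum end - start per label value for (value, start, end) tuples."""
    totals = defaultdict(float)
    for value, start, end in labels:
        totals[value] += end - start
    return dict(totals)


labels = [('a', 3, 5), ('b', 5, 8), ('a', 8, 10), ('b', 10, 14), ('a', 15, 18.5)]
print(total_durations(labels))  # {'a': 7.5, 'b': 7.0}
```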

Return a list of all occurring label values.

Returns:Lexicographically sorted list (str) of label values.
Return type:list


>>> ll = LabelList(labels=[
...     Label('a', 3.2, 4.5),
...     Label('b', 5.1, 8.9),
...     Label('c', 7.2, 10.5),
...     Label('d', 10.5, 14),
...     Label('d', 15, 18)
... ])
>>> ll.label_values()
['a', 'b', 'c', 'd']

Return list of labels.

labels_in_range(start, end, fully_included=False)[source]

Return a list of labels that are within the given range. By default, labels that only partially overlap the range are included as well.

  • start (float) – Start-time in seconds.
  • end (float) – End-time in seconds.
  • fully_included (bool) – If True, only labels fully included in the range are returned. Otherwise also overlapping ones are returned. (default False)

Returns:List of labels in the range.
Return type:list



>>> ll = LabelList(labels=[
...     Label('a', 3.2, 4.5),
...     Label('b', 5.1, 8.9),
...     Label('c', 7.2, 10.5),
...     Label('d', 10.5, 14)
... ])
>>> ll.labels_in_range(6.2, 10.1)
[Label('b', 5.1, 8.9), Label('c', 7.2, 10.5)]
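
The difference between the two modes of fully_included can be sketched with (value, start, end) tuples. The helper below is hypothetical. Treating labels that only touch the range boundary as outside the range is an assumption of this sketch.

```python
def labels_in_range(labels, start, end, fully_included=False):
    """Select (value, start, end) labels that lie in the range start..end."""
    selected = []
    for label in labels:
        _, label_start, label_end = label
        if fully_included:
            keep = start <= label_start and label_end <= end
        else:
            # Assumption: labels that only touch the boundary do not count.
            keep = label_start < end and label_end > start
        if keep:
            selected.append(label)
    return selected


labels = [('a', 3.2, 4.5), ('b', 5.1, 8.9), ('c', 7.2, 10.5), ('d', 10.5, 14)]
print(labels_in_range(labels, 6.2, 10.1))
# [('b', 5.1, 8.9), ('c', 7.2, 10.5)]
print(labels_in_range(labels, 5.0, 10.1, fully_included=True))
# [('b', 5.1, 8.9)]
```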

Merge overlapping labels with the same value. Two labels l1 and l2 (with l1 starting first) are considered overlapping if l2.start - l1.end < threshold.

Parameters:threshold (float) – Maximal distance between two labels to be considered as overlapping. (default: 0.0)
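
The merge rule above can be sketched on (value, start, end) tuples as follows. The function merge_overlapping is a hypothetical standalone helper that returns a new list instead of modifying a label-list in place. It follows the stated condition l2.start - l1.end < threshold but is not audiomate's implementation.

```python
def merge_overlapping(labels, threshold=0.0):
    """Merge (value, start, end) labels with equal value that overlap."""
    merged = []
    last_by_value = {}  # value -> index in merged of the most recent label
    ordered = sorted(labels, key=lambda label: (label[1], label[2]))
    for value, start, end in ordered:
        index = last_by_value.get(value)
        if index is not None and start - merged[index][2] < threshold:
            # Extend the previous label with the same value.
            prev_value, prev_start, prev_end = merged[index]
            merged[index] = (prev_value, prev_start, max(prev_end, end))
        else:
            last_by_value[value] = len(merged)
            merged.append((value, start, end))
    return merged


labels = [('a_label', 1.0, 2.0), ('a_label', 1.5, 2.7), ('b_label', 1.0, 2.0)]
print(merge_overlapping(labels))
# [('a_label', 1.0, 2.7), ('b_label', 1.0, 2.0)]
```

With the default threshold of 0.0, labels that merely touch are not merged in this sketch, because l2.start - l1.end is then 0 and not smaller than the threshold.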


>>> ll = LabelList(labels=[
...     Label('a_label', 1.0, 2.0),
...     Label('a_label', 1.5, 2.7),
...     Label('b_label', 1.0, 2.0),
... ])
>>> ll.merge_overlapping_labels()
>>> ll.labels
[
    Label('a_label', 1.0, 2.7),
    Label('b_label', 1.0, 2.0),
]

ranges(yield_ranges_without_labels=False, include_labels=None)[source]

Generate all ranges of the label-list. A range is defined as a part of the label-list for which the same labels are defined.

  • yield_ranges_without_labels (bool) – If True also yields ranges for which no labels are defined.
  • include_labels (list) – If not empty, only the label values in the list will be considered.

Returns:A generator which yields one range (tuple start/end/list-of-labels) at a time.
Return type:generator



>>> ll = LabelList(labels=[
...     Label('a', 3.2, 4.5),
...     Label('b', 5.1, 8.9),
...     Label('c', 7.2, 10.5),
...     Label('d', 10.5, 14)
... ])
>>> ranges = ll.ranges(yield_ranges_without_labels=True)
>>> next(ranges)
(3.2, 4.5, [ < audiomate.annotations.Label at 0x1090527c8 > ])
>>> next(ranges)
(4.5, 5.1, [])
>>> next(ranges)
(5.1, 7.2, [ < audiomate.annotations.label.Label at 0x1090484c8 > ])
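
One way to picture how such ranges arise is a sweep over all start and end times. The generator below is a hypothetical sketch on (value, start, end) tuples that yields label values instead of Label objects and ignores the include_labels parameter. It is not how audiomate implements ranges().

```python
def iter_ranges(labels, yield_ranges_without_labels=False):
    """Yield (start, end, values) for each stretch with a constant set of labels."""
    boundaries = sorted({t for _, start, end in labels for t in (start, end)})
    for left, right in zip(boundaries, boundaries[1:]):
        # A label is active if it covers the whole stretch between two boundaries.
        active = [value for value, start, end in labels
                  if start <= left and end >= right]
        if active or yield_ranges_without_labels:
            yield left, right, active


labels = [('a', 3.2, 4.5), ('b', 5.1, 8.9), ('c', 7.2, 10.5), ('d', 10.5, 14)]
for item in iter_ranges(labels, yield_ranges_without_labels=True):
    print(item)
# (3.2, 4.5, ['a'])
# (4.5, 5.1, [])
# (5.1, 7.2, ['b'])
# (7.2, 8.9, ['b', 'c'])
# (8.9, 10.5, ['c'])
# (10.5, 14, ['d'])
```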

Create a separate label-list for every distinct label-value.

Returns:A dictionary with distinct label-values as keys. Every value is a LabelList containing only labels with the same value.
Return type:dict


>>> ll = LabelList(idx='some', labels=[
...     Label('a', start=0, end=4),
...     Label('b', start=3.95, end=6.0),
...     Label('a', start=7.0, end=10.2),
...     Label('b', start=10.3, end=14.0)
... ])
>>> s = ll.separate()
>>> s['a'].labels
[Label('a', start=0, end=4), Label('a', start=7.0, end=10.2)]
>>> s['b'].labels
[Label('b', start=3.95, end=6.0), Label('b', start=10.3, end=14.0)]

split(cutting_points, shift_times=False, overlap=0.0)[source]

Split the label-list into x parts and return them as new label-lists. x is defined by the number of cutting points (x == len(cutting_points) + 1).

The result is a list of label-lists corresponding to each part. Label-list 0 contains labels between 0 and cutting_points[0]. Label-list 1 contains labels between cutting_points[0] and cutting_points[1]. And so on.

  • cutting_points (list) – List of floats defining the points in seconds where the label-list is split.
  • shift_times (bool) – If True, start and end times are shifted in the resulting label-lists, so that they are relative to the cutting point and not to the beginning of the original label-list.
  • overlap (float) – Amount of overlap in seconds. This amount is subtracted from a start cutting point and added to an end cutting point.

Returns:A list of audiomate.annotations.LabelList.
Return type:list



>>> ll = LabelList(labels=[
...     Label('a', 0, 5),
...     Label('b', 5, 10),
...     Label('c', 11, 15),
... ])
>>> res = ll.split([4.1, 8.9, 12.0])
>>> len(res)
4
>>> res[0].labels
[Label('a', 0.0, 4.1)]
>>> res[1].labels
[
    Label('a', 4.1, 5.0),
    Label('b', 5.0, 8.9)
]
>>> res[2].labels
[
    Label('b', 8.9, 10.0),
    Label('c', 11.0, 12.0)
]
>>> res[3].labels
[Label('c', 12.0, 15.0)]

If shift_times = True, the times are adjusted to be relative to the cutting-points for every label-list but the first.

>>> ll = LabelList(labels=[
...     Label('a', 0, 5),
...     Label('b', 5, 10),
... ])
>>> res = ll.split([4.6], shift_times=True)
>>> len(res)
2
>>> res[0].labels
[Label('a', 0.0, 4.6)]
>>> res[1].labels
[
    Label('a', 0.0, 0.4),
    Label('b', 0.4, 5.4)
]
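
The clipping and shifting described for split() can be sketched on (value, start, end) tuples. The function split_labels is a hypothetical helper that leaves out the overlap parameter for brevity. It illustrates the idea and is not the library's implementation.

```python
import math


def split_labels(labels, cutting_points, shift_times=False):
    """Split (value, start, end) labels at the given cutting points."""
    edges = [0.0] + sorted(cutting_points) + [math.inf]
    parts = []
    for seg_start, seg_end in zip(edges, edges[1:]):
        offset = seg_start if shift_times else 0.0
        part = []
        for value, start, end in labels:
            # Clip the label to the current segment.
            clipped_start = max(start, seg_start)
            clipped_end = min(end, seg_end)
            if clipped_start < clipped_end:
                part.append((value, clipped_start - offset, clipped_end - offset))
        parts.append(part)
    return parts


parts = split_labels([('a', 0, 5), ('b', 5, 10)], [4.6], shift_times=True)
print(len(parts))  # 2
# Rounded for display, since float subtraction is not exact.
print([(v, round(s, 2), round(e, 2)) for v, s, e in parts[1]])
# [('a', 0.0, 0.4), ('b', 0.4, 5.4)]
```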

Return the start of the earliest starting label (lower bound).

tokenized(delimiter=' ', overlap_threshold=0.1)[source]

Return an ordered list of tokens based on all labels. Joins all tokens from all labels (label.tokenized()). If the overlap between two labels is greater than overlap_threshold, an exception is raised.

  • delimiter (str) – The delimiter used to split labels into tokens. (default: space)
  • overlap_threshold (float) – Maximum overlap between two consecutive labels.

Returns:A list containing tokens of all labels ordered according to the label order.
Return type:list



>>> ll = LabelList(idx='some', labels=[
...     Label('a d q', start=0, end=4),
...     Label('b', start=3.95, end=6.0),
...     Label('c a', start=7.0, end=10.2),
...     Label('f g', start=10.3, end=14.0)
... ])
>>> ll.tokenized(delimiter=' ', overlap_threshold=0.1)
['a', 'd', 'q', 'b', 'c', 'a', 'f', 'g']

Return the cumulative length of all labels. (Number of characters)


Add a list of labels to the end of the list.

Parameters:labels (list) – Labels to add.

classmethod with_label_values(values, idx='default')[source]

Create a new label-list containing labels with the given values. All labels will have default start/end values of 0 and inf.

  • values (list) – List of values (str) for which labels are created and appended to the label-list.
  • idx (str) – The idx of the label-list.

Returns:New label-list.
Return type:LabelList



>>> ll = LabelList.with_label_values(['a', 'x', 'z'], idx='letters')
>>> ll.idx
'letters'
>>> ll.labels
[
    Label('a', 0, inf),
    Label('x', 0, inf),
    Label('z', 0, inf),
]


exception audiomate.annotations.relabeling.UnmappedLabelsException(message)[source]

audiomate.annotations.relabeling.find_missing_projections(label_list, projections)[source]

Finds all combinations of labels in label_list that are not covered by an entry in the dictionary of projections. Returns a list containing tuples of uncovered label combinations or an empty list if there are none. All uncovered label combinations are naturally sorted.

Each entry in the dictionary of projections represents a single projection that maps a combination of labels (key) to a single new label (value). The combination of labels to be mapped is a tuple of naturally sorted labels that apply to one or more segments simultaneously. By defining a special wildcard projection using ('**',) it is not required to specify a projection for every single combination of labels.

  • label_list (audiomate.annotations.LabelList) – The label list to relabel
  • projections (dict) – A dictionary that maps tuples of label combinations to string labels.

Returns:List of combinations of labels that are not covered by any projection.
Return type:list



>>> ll = annotations.LabelList(labels=[
...     annotations.Label('b', 3.2, 4.5),
...     annotations.Label('a', 4.0, 4.9),
...     annotations.Label('c', 4.2, 5.1)
... ])
>>> find_missing_projections(ll, {('b',): 'new_label'})
[('a', 'b'), ('a', 'b', 'c'), ('a', 'c'), ('c',)]
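
The notion of "combinations of labels" can be made concrete with a small sketch that collects all label values active at the same time and checks them against the projections. Both helpers are hypothetical and work on (value, start, end) tuples. Plain sorted() is used here in place of natural sorting, which is a simplification.

```python
def label_combinations(labels):
    """Return the sorted list of value combinations that occur simultaneously."""
    boundaries = sorted({t for _, start, end in labels for t in (start, end)})
    combos = set()
    for left, right in zip(boundaries, boundaries[1:]):
        active = sorted(value for value, start, end in labels
                        if start <= left and end >= right)
        if active:
            combos.add(tuple(active))
    return sorted(combos)


def missing_projections(labels, projections):
    """List combinations without a projection; ('**',) covers everything."""
    if ('**',) in projections:
        return []
    return [combo for combo in label_combinations(labels)
            if combo not in projections]


labels = [('b', 3.2, 4.5), ('a', 4.0, 4.9), ('c', 4.2, 5.1)]
print(missing_projections(labels, {('b',): 'new_label'}))
# [('a', 'b'), ('a', 'b', 'c'), ('a', 'c'), ('c',)]
```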

Loads projections defined in the given projections_file.

The projections_file is expected to be in the following format:

old_label_1 | new_label_1
old_label_1 old_label_2 | new_label_2
old_label_3 |

You can define one projection per line. Each projection starts with a list of one or multiple old labels (separated by a single space) that are separated from the new label by a pipe (|). In the example above, the segment labeled with old_label_1 will be labeled with new_label_1 after applying the projection. Segments that are labeled with old_label_1 and old_label_2 concurrently are relabeled to new_label_2. All segments labeled with old_label_3 are dropped. Combinations of multiple labels are automatically sorted in natural order.

Parameters:projections_file (str) – Path to the file with projections
Returns:Dictionary where the keys are tuples of labels to project to the key’s value
Return type:dict
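
A parser for the file format described above might look roughly like the following. The function parse_projections is hypothetical and takes the file content as a string to keep the sketch self-contained. Plain sorted() stands in for natural ordering, and representing a dropped label by an empty string is an assumption.

```python
def parse_projections(text):
    """Parse 'old labels | new label' lines into a projections dictionary."""
    projections = {}
    for line in text.splitlines():
        if '|' not in line:
            continue  # skip blank or malformed lines
        old_part, new_part = line.split('|', 1)
        # Assumption: plain sorted() stands in for natural ordering here.
        combination = tuple(sorted(old_part.split()))
        projections[combination] = new_part.strip()
    return projections


content = """old_label_1 | new_label_1
old_label_1 old_label_2 | new_label_2
old_label_3 |
"""
print(parse_projections(content))
# {('old_label_1',): 'new_label_1', ('old_label_1', 'old_label_2'): 'new_label_2', ('old_label_3',): ''}
```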


>>> load_projections('/path/to/projections.txt')
{('b',): 'foo', ('a', 'b'): 'a_b', ('a',): 'bar'}

audiomate.annotations.relabeling.relabel(label_list, projections)[source]

Relabel an entire LabelList using user-defined projections. Labels can be renamed, removed or overlapping labels can be flattened to a single label per segment.

Each entry in the dictionary of projections represents a single projection that maps a combination of labels (key) to a single new label (value). The combination of labels to be mapped is a tuple of naturally sorted labels that apply to one or more segments simultaneously. By defining a special wildcard projection using ('**',) it is not required to specify a projection for every single combination of labels.

This method raises an UnmappedLabelsException if a projection for one or more combinations of labels is not defined.

  • label_list (audiomate.annotations.LabelList) – The label list to relabel
  • projections (dict) – A dictionary that maps tuples of label combinations to string labels.

Returns:New label list with remapped labels.
Return type:audiomate.annotations.LabelList
Raises:UnmappedLabelsException – If a projection for one or more combinations of labels is not defined.


>>> projections = {
...     ('a',): 'a',
...     ('b',): 'b',
...     ('c',): 'c',
...     ('a', 'b',): 'a_b',
...     ('a', 'b', 'c',): 'a_b_c',
...     ('**',): 'b_c',
... }
>>> label_list = annotations.LabelList(labels=[
...     annotations.Label('a', 3.2, 4.5),
...     annotations.Label('b', 4.0, 4.9),
...     annotations.Label('c', 4.2, 5.1)
... ])
>>> ll = relabel(label_list, projections)
>>> [l.value for l in ll]
['a', 'a_b', 'a_b_c', 'b_c', 'c']
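
How the projections and the wildcard interact can be sketched on (value, start, end) tuples. The names relabel_ranges and UnmappedLabels are hypothetical stand-ins. The sketch maps every stretch of simultaneous labels separately and does not merge neighbouring stretches or drop labels, so it simplifies whatever relabel() does internally.

```python
class UnmappedLabels(Exception):
    """Stand-in for UnmappedLabelsException in this sketch."""


def relabel_ranges(labels, projections):
    """Map every stretch of simultaneous labels to its projected value."""
    boundaries = sorted({t for _, start, end in labels for t in (start, end)})
    result = []
    for left, right in zip(boundaries, boundaries[1:]):
        combo = tuple(sorted(value for value, start, end in labels
                             if start <= left and end >= right))
        if not combo:
            continue  # nothing is labeled in this stretch
        if combo in projections:
            new_value = projections[combo]
        elif ('**',) in projections:
            new_value = projections[('**',)]  # wildcard fallback
        else:
            raise UnmappedLabels('no projection for {}'.format(combo))
        result.append((new_value, left, right))
    return result


projections = {
    ('a',): 'a', ('b',): 'b', ('c',): 'c',
    ('a', 'b'): 'a_b', ('a', 'b', 'c'): 'a_b_c', ('**',): 'b_c',
}
labels = [('a', 3.2, 4.5), ('b', 4.0, 4.9), ('c', 4.2, 5.1)]
print([value for value, _, _ in relabel_ranges(labels, projections)])
# ['a', 'a_b', 'a_b_c', 'b_c', 'c']
```

The stretch from 4.5 to 4.9 carries the combination ('b', 'c'), which has no explicit entry, so the wildcard projection supplies 'b_c'.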


