怎么用一个整数来表示一个列表

发布时间：2021-10-21 11:38:35 来源：亿速云阅读：142 作者：iii 栏目：编程语言

这篇文章主要介绍“怎么用一个整数来表示一个列表”，在日常操作中，相信很多人在怎么用一个整数来表示一个列表问题上存在疑惑，小编查阅了各式资料，整理出简单好用的操作方法，希望对大家解答”怎么用一个整数来表示一个列表”的疑惑有所帮助！接下来，请跟着小编一起来学习吧！

概要

与 C、Rust 和 Go 不同，Python 默认的int 具有任意大小。[注1] 、[注2]

这意味着，一个整数可以存储无限大的值，只要内存足够。

例如，你可以打开 Python3 并运行以下命令：

>>> import math
>>> math.factorial(2020)
[number omitted]  # Python猫注：此处求2020的阶乘，结果是一长串数字，所以省略
>>> math.log2(math.factorial(2020))
19272.453841606068
>>> type(math.factorial(2020))
<class 'int'>

也就是说，在 Python 中，平常使用的 int 可以轻松地保存一个占用 19273 比特的 C 类型固定大小无符号 int 类型的值（C-style fixed-size unsigned int ）。在 Python 这样的语言中，便利性高于速度和内存效率，这确实很有用。

这种无限的精度，也意味着我们可以在单个 int 中存储任意数量的信息。只要编码正确，一整本书、一整个数据库、甚至任何东西，都可以被存入一个单独的 Python int 中。

(Python猫注：这有一篇文章，深度剖析了Python整型不会溢出的实现原理，可作关联阅读)

因此，我们可以设想出一种 Python 的方言，它只有整型，需要用 int 表示其它所有的类型（字典、列表、等等）。我们还有一些特殊的函数和方法，可以将 int 视为 list 、dict 等等。

这将会是一个有趣而好玩的练习，而这就是本文想要做的事。

有一个显而易见的实现方法：所有数据结构只是内存中的位数组（bit-arrays）。最坏的情况下，它是一组相关的位数组（例如，像链表或树中的每个节点），并且它们的集合也只是位数组。位数组可以被解释为二进制数。所以我们必然能这样做。但这有点无聊。

在本博文以及本系列的后续博文中，我将介绍一些用 int 来表示复杂数据结构的方法。它们不一定是最紧凑、最合理或最有效的，其共同的目标是找到这些数据结构的有趣的表示方式。[注3]

哥德尔数（Gödel numbering）简介

我们要表示的第一个数据结构是 list。我们将使用以逻辑学家 KurtGödel 命名的Gödel数。为了方便起见，我们仅处理由无符号整数（即自然数）组成的列表。

哥德尔数的原理是令每个大于 1 的自然数都用唯一的质数分解来表示。它依据的是算术的基本定理。

（Python猫注：质数分解，即 prime factorization，又译作质因数分解、素因子分解等，指的是把每个数都写成用质数相乘的形式）

看一些例子：

一个数字可以通过其质因子（prime factors ）的指数列表来唯一标识（直到其最高位的非零指数）。所以，我们可以用 126 来表示列表[1, 2, 0, 1] 。列表中的第一个数字是 126 作质数分解后 2 的指数，第二个数是 3 的指数，依此类推。

再来几个例子：

如果列表末尾有 0 ，该怎么办呢？好吧，基于这样的编码，不会出现这种情况。

在我们的质数分解中，指数为 0 的质数可能有无限个，因此我们需要停在某个地方。[注4] 我们选择在最后一个非零指数处停止。

当列表中包含较大的数字时，这种表示形式也会使用非常大的数字。那是因为列表中的数字表示的是指数，所以 int 的大小与它们成指数增长。例如，[50, 1000, 250] 需要使用大小为 2266 比特的数字表示。

另一方面，相比于其它用 int 编码的列表，那些包含非常多小整数的长列表，尤其是大型稀疏列表（即大部分的值都为 0），则拥有非常紧凑的表示形式。

提醒一下，将 list 编码为 int，这不是很好的编程实践，仅仅是一个好玩的实验。

Python实现

让我们看一下 Python 的实现。这里有几点注意事项：

我们会使用带有 yield 的函数，因为它极大地简化了操作。[注5]
你会看到大量的 while 循环。这是因为列表生成式、range 和大多数你打算在 for 循环中使用的东西，都被禁止用在只有 int 类型的方言中。所有这些都被 while 循环替代了。

质数生成器

我们要编写的第一个函数是一个迭代器，它将按顺序生成质数。它从头到尾都很关键。这里的实现是最简单可行的版本。

我可能很快会写一篇完整的关于生成质数的算法的文章，因为这是一个很酷的话题，本身也是一个古老的研究领域。最广为人知的算法是爱拉托逊斯筛法（Sieve of Erathosthenes ），但这只是冰山一角。[注6]

在这里，一个非常幼稚的实现就够了：

def primes(starting: int = 2):
    """Yield the primes in order.

    Args:
        starting: sets the minimum number to consider.

    Note: `starting` can be used to get all prime numbers
    _larger_ than some number. By default it doesn't skip
    any candidate primes.
    """
    candidate_prime = starting
    while True:
        candidate_factor = 2
        is_prime = True
        # We'll try all the numbers between 2 and
        # candidate_prime / 2. If any of them divide
        # our candidate_prime, then it's not a prime!
        while candidate_factor <= candidate_prime // 2:
            if candidate_prime % candidate_factor == 0:
                is_prime = False
                break
            candidate_factor += 1
        if is_prime:
            yield candidate_prime
        candidate_prime += 1

创建空列表

def empty_list() -> int:
    """Create a new empty list."""
    # 1 is the empty list. It isn't divisible by any prime.
    return 1

遍历元素

def iter_list(l: int):
    """Yields elements in the list, from first to last."""
    # We go through each prime in order. The next value of
    # the list is equal to the number of times the list is
    # divisible by the prime.
    for p in primes():
        # We decided we will have no trailing 0s, so when
        # the list is 1, it's over.
        if l <= 1:
            break
        # Count the number of divisions until the list is
        # not divisible by the prime number.
        num_divisions = 0
        while l % p == 0:
            num_divisions += 1
            l = l // p  # could be / as well
        yield num_divisions

访问元素

def access(l: int, i: int) -> int:
    """Return i-th element of l."""
    # First we iterate over all primes until we get to the
    # ith prime.
    j = 0
    for p in primes():
        if j == i:
            ith_prime = p
            break
        j += 1
    # Now we divide the list by the ith-prime until we
    # cant divide it no more.
    num_divisions = 0
    while l % ith_prime == 0:
        num_divisions += 1
        l = l // ith_prime
    return num_divisions

添加元素

def append(l: int, elem: int) -> int:
    # The first step is finding the largest prime factor.
    # We look at all primes until l.
    # The next prime after the last prime factor is going
    # to be the base we need to use to append.
    # E.g. if the list if 18 -> 2**1 * 3**2 -> [1, 2]
    # then the largest prime factor is 3, and we will
    # multiply by the _next_ prime factor to some power to
    # append to the list.
    last_prime_factor = 1  # Just a placeholder
    for p in primes():
        if p > l:
            break
        if l % p == 0:
            last_prime_factor = p
    # Now get the _next_ prime after the last in the list.
    for p in primes(starting=last_prime_factor + 1):
        next_prime = p
        break
    # Now finally we append an item by multiplying the list
    # by the next prime to the `elem` power.
    return l * next_prime ** elem

试用这些函数

你可以打开一个 Python、iPython 或 bPython会话，并试试这些函数！

建议列表元素使用从 1 到 10 之间的数字。如果使用比较大的数字，则 append 和 access 可能会花费很长时间。

从某种程度上说，使用哥德尔数来表示列表并不实用，尽管可以通过优化质数生成及分解算法，来极大地扩大可用数值的范围。

In [16]: l = empty_list()

In [17]: l = append(l, 2)

In [18]: l = append(l, 5)

In [19]: list(iter_list(l))
Out[19]: [2, 5]

In [20]: access(l, 0)
Out[20]: 2

In [21]: access(l, 1)
Out[21]: 5

In [22]: l
Out[22]: 972  # Python猫注：2^2*3^5=972

其它 int 编码

我们看到了一种将自然数列表表示为 int 的方法。还有其它更实用的方法，这些方法依赖于将数字的二进制形式细分为大小不一的块。我相信你可以提出这样的建议。

我以后可能会写其它文章，介绍更好的用于生成和分解质数的算法，以及其它复杂数据结构的 int 表示形式。

到此，关于“怎么用一个整数来表示一个列表”的学习就结束了，希望能够解决大家的疑惑。理论与实践的搭配能更好的帮助大家学习，快去试试吧！若想继续学习更多相关知识，请继续关注亿速云网站，小编会继续努力为大家带来更多实用的文章！

向AI问一下细节