#5 update

Merged
jiayu_neu merged 1 commit from lyb/mlp_and_nn:master into master 1 year ago
  1. Tensor.ipynb (+1134, -0)

Tensor.ipynb

@@ -0,0 +1,1134 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "e76f8531",
"metadata": {},
"source": [
"## 1 张量 Tensor\n",
"\n",
"数据对于AI而言至关重要,为了能够完成各种数据操作,我们需要某种方法来存储和处理数据。通常,我们需要做两件重要的事:(1)获取数据;(2)将数据读入计算机后对其进行处理。如果没有某种方法来存储数据,那么获取数据是没有意义的。\n",
"\n",
"许多AI框架都定义了一种基本的数据结构——张量([Tensor](https://www.mindspore.cn/docs/zh-CN/r1.7/api_python/mindspore/mindspore.Tensor.html)),它与多维数组的概念非常相似。张量可以将向量和矩阵推广到任意维度,可用来表示在一些矢量、标量和其他张量之间的线性关系的多线性函数,这些线性关系的基本例子有内积、外积、线性映射以及笛卡儿积。\n",
"\n",
"起初在数学、物理和工程领域,张量这个术语会与空间、参考系统以及它们之间的转换的概念捆绑在一起。后来在AI领域,借鉴了张量的思想,定义了张量这一数据结构。张量的维度与用来表示张量中标量值的索引数量一致,下图形象地展示了张量的数据结构:\n",
"\n",
"<div align=center>\n",
" <img src=\"./img/tensor_1.jpg\" width=\"50%\" height=\"50%\" />\n",
"</div>\n",
"\n",
"<div align=center>\n",
" <img src=\"./img/tensor_2.jpg\" width=\"50%\" height=\"50%\" />\n",
"</div>\n",
"\n",
"张量也是MindSpore网络运算中的基本数据结构,它与Numpy中定义的array数据结构相似,但张量具有很多针对深度学习的功能,如支持各种算力芯片(包括CPU、GPU和NPU)和自动微分等。"
]
},
{
"cell_type": "markdown",
"id": "3fad6394",
"metadata": {},
"source": [
"### 1.1 MindSpore中张量类的格式\n",
"\n",
"`class mindspore.Tensor(input_data=None, dtype=None, shape=None, init=None, internal=False)`\n",
"\n",
"参数:\n",
"\n",
"+ **input_data** (Union [Tensor, float, int, bool, tuple, list, numpy.ndarray]) - 被存储的数据,可以是其它Tensor,也可以是Python基本数据(如int,float,bool等),或是一个NumPy对象。默认值:None。\n",
"\n",
"+ **dtype** (mindspore.dtype) - 用于定义该Tensor的数据类型,必须是 mindspore.dtype 中定义的类型。如果该参数为None,则数据类型与 input_data 一致,默认值:None。\n",
"\n",
"+ **shape** (Union [tuple, list, int]) - 用于定义该Tensor的形状。如果指定了 input_data ,则无需设置该参数。默认值:None。\n",
"\n",
"+ **init** (Initializer) - 用于在并行模式中延迟Tensor的数据的初始化,如果指定该参数,则 dtype 和 shape 也必须被指定。不推荐在非自动并行之外的场景下使用该接口。只有当调用 Tensor.init_data 时,才会使用指定的 init 来初始化Tensor数据。默认值:None。\n",
"\n",
"+ **internal** (bool) - Tensor是否由框架创建。 如果为True,表示Tensor是由框架创建的,如果为False,表示Tensor是由用户创建的。默认值:False。"
]
},
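{
"cell_type": "markdown",
"id": "ad0c0de1",
"metadata": {},
"source": [
"A minimal sketch of the most common combination of these parameters: passing `input_data` together with an explicit `dtype` (the `shape`/`init` combination is shown in section 1.2)."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "ad0c0de2",
"metadata": {},
"outputs": [],
"source": [
"# Sketch: construct a Tensor from Python data with an explicit dtype.\n",
"from mindspore import Tensor\n",
"from mindspore import dtype as mstype\n",
"\n",
"t = Tensor([1, 2, 3], dtype=mstype.float32)  # input_data + dtype\n",
"print(t)       # expected: [1. 2. 3.]\n",
"print(t.dtype) # expected: Float32"
]
},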
{
"cell_type": "markdown",
"id": "b70b9fd1",
"metadata": {},
"source": [
"### 1.2 创建张量\n",
"\n",
"张量的创建方式有多种,构造张量时,支持传入`Tensor`、`float`、`int`、`bool`、`tuple`、`list`和`numpy.ndarray`类型。\n",
"\n",
"+ **从Python原始数据直接生成张量**\n",
"\n",
"可以根据Python原始数据,如单个数据、列表等,数据类型可以设置或者通过框架自动推断。"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "7fed6672",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"0.1\n",
"[1 2 3]\n",
"[[1 1]\n",
" [2 2]\n",
" [3 3]]\n",
"<class 'mindspore.common.tensor.Tensor'>\n"
]
}
],
"source": [
"from mindspore import Tensor\n",
"\n",
"tensor_0 = Tensor(0.1)\n",
"tensor_1 = Tensor([1, 2, 3])\n",
"tensor_2 = Tensor([[1, 1], [2, 2], [3, 3]])\n",
"print(tensor_0)\n",
"print(tensor_1)\n",
"print(tensor_2)\n",
"print(type(tensor_0))"
]
},
{
"cell_type": "markdown",
"id": "99c2c8cd",
"metadata": {},
"source": [
"+ **从NumPy数组生成张量**"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "6be7d1e8",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"<class 'numpy.ndarray'>\n",
"<class 'mindspore.common.tensor.Tensor'>\n"
]
}
],
"source": [
"import numpy as np\n",
"\n",
"arr = np.array([1, 0, 1, 0])\n",
"tensor_arr = Tensor(arr)\n",
"print(type(arr))\n",
"print(type(tensor_arr))"
]
},
{
"cell_type": "markdown",
"id": "cffb4a51",
"metadata": {},
"source": [
"初始值的类型是`NumPy.array`,则生成的`Tensor`数据类型与之对应。"
]
},
{
"cell_type": "markdown",
"id": "c0f241fe",
"metadata": {},
"source": [
"+ **使用init初始化器生成张量**\n",
"\n",
"当使用`init`初始化器对张量进行初始化时,支持传入的参数有`init`、`shape`、`dtype`。\n",
"\n",
"+ `init`: 支持传入[initializer](https://www.mindspore.cn/docs/zh-CN/r1.7/api_python/mindspore.common.initializer.html)的子类。\n",
"\n",
"+ `shape`: 支持传入 `list`、`tuple`、 `int`。\n",
"\n",
"+ `dtype`: 支持传入[mindspore.dtype](https://www.mindspore.cn/docs/zh-CN/r1.7/api_python/mindspore/mindspore.dtype.html#mindspore.dtype)。"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "e9efcf1a",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"tensor1:\n",
" [[1. 1.]\n",
" [1. 1.]]\n",
"tensor2:\n",
" [[-0.00128023 -0.01392901]\n",
" [ 0.0130886 -0.00107818]]\n"
]
}
],
"source": [
"from mindspore import Tensor\n",
"from mindspore import set_seed\n",
"from mindspore import dtype as mstype\n",
"from mindspore.common.initializer import One, Normal\n",
"\n",
"set_seed(1)\n",
"\n",
"tensor1 = Tensor(shape=(2, 2), dtype=mstype.float32, init=One())\n",
"tensor2 = Tensor(shape=(2, 2), dtype=mstype.float32, init=Normal())\n",
"\n",
"print(\"tensor1:\\n\", tensor1)\n",
"print(\"tensor2:\\n\", tensor2)"
]
},
{
"cell_type": "markdown",
"id": "f7c634b8",
"metadata": {},
"source": [
"`init`主要用于并行模式下的延后初始化,在正常情况下不建议使用init对参数进行初始化。"
]
},
{
"cell_type": "markdown",
"id": "c326be92",
"metadata": {},
"source": [
"+ **继承另一个张量的属性,生成新的张量**"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "4ef33b9d",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[[1 1]\n",
" [1 1]]\n",
"input shape: (2, 2)\n",
"output shape: (2, 2)\n"
]
}
],
"source": [
"from mindspore import ops\n",
"\n",
"x = Tensor(np.array([[0, 1], [2, 1]]).astype(np.int32))\n",
"# oneslike = ops.OnesLike()\n",
"# output = oneslike(x)\n",
"output = ops.ones_like(x)\n",
"\n",
"print(output)\n",
"print(\"input shape:\", x.shape)\n",
"print(\"output shape:\", output.shape)"
]
},
{
"cell_type": "markdown",
"id": "126800c4",
"metadata": {},
"source": [
"+ **输出指定大小的恒定值张量**\n",
"\n",
"`shape`是张量的尺寸元组,确定输出的张量的维度。"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "938ab3ca",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[[0. 0.]\n",
" [0. 0.]]\n"
]
}
],
"source": [
"shape = (2, 2)\n",
"ones = ops.Ones()\n",
"output = ones(shape, mstype.float32)\n",
"\n",
"zeros = ops.Zeros()\n",
"output = zeros(shape, mstype.float32)\n",
"print(output)"
]
},
{
"cell_type": "markdown",
"id": "20068f0f",
"metadata": {},
"source": [
"### 1.3 张量索引\n",
"\n",
"Tensor索引与Numpy索引类似,索引从0开始编制,负索引表示按倒序编制,冒号`:`和 `...`用于对数据进行切片操作。"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "b34c0bd1",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"First row: [0. 1.]\n",
"value of top right corner: 3.0\n",
"Last column: [1. 3.]\n",
"First column: [0. 2.]\n"
]
}
],
"source": [
"tensor = Tensor(np.array([[0, 1], [2, 3]]).astype(np.float32))\n",
"\n",
"print(\"First row: {}\".format(tensor[0]))\n",
"print(\"value of top right corner: {}\".format(tensor[1, 1]))\n",
"print(\"Last column: {}\".format(tensor[:, -1]))\n",
"print(\"First column: {}\".format(tensor[..., 0]))"
]
},
{
"cell_type": "markdown",
"id": "0f52281c",
"metadata": {},
"source": [
"`Tensor`初始化时,可指定dtype,如`mstype.int32`、`mstype.float32`、`mstype.bool`等。"
]
},
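{
"cell_type": "markdown",
"id": "ad0c0de3",
"metadata": {},
"source": [
"A minimal sketch of specifying different dtypes at construction time:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "ad0c0de4",
"metadata": {},
"outputs": [],
"source": [
"# Sketch: explicit dtypes at construction time.\n",
"import numpy as np\n",
"from mindspore import Tensor\n",
"from mindspore import dtype as mstype\n",
"\n",
"t_int = Tensor(np.array([0, 1, 2]), mstype.int32)\n",
"t_bool = Tensor(np.array([True, False, True]), mstype.bool_)\n",
"print(t_int.dtype, t_bool.dtype)  # expected: Int32 Bool"
]
},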
{
"cell_type": "markdown",
"id": "7694ee38",
"metadata": {},
"source": [
"### 1.4 张量的属性\n",
"\n",
"张量的属性包括形状、数据类型、转置张量、单个元素大小、占用字节数量、维数、元素个数和每一维步长。\n",
"\n",
"+ 形状(shape):`Tensor`的shape,是一个tuple。\n",
"\n",
"+ 数据类型(dtype):`Tensor`的dtype,是MindSpore的一个数据类型。\n",
"\n",
"+ 转置张量(T):`Tensor`的转置,是一个`Tensor`。\n",
"\n",
"+ 单个元素大小(itemsize): `Tensor`中每一个元素占用字节数,是一个整数。\n",
"\n",
"+ 占用字节数量(nbytes): `Tensor`占用的总字节数,是一个整数。\n",
"\n",
"+ 维数(ndim): `Tensor`的秩,也就是len(tensor.shape),是一个整数。\n",
"\n",
"+ 元素个数(size): `Tensor`中所有元素的个数,是一个整数。\n",
"\n",
"+ 每一维步长(strides): `Tensor`每一维所需要的字节数,是一个tuple。"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "68112de5",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"x_shape: (2, 2)\n",
"x_dtype: Int32\n",
"x_transposed:\n",
" [[1 3]\n",
" [2 4]]\n",
"x_itemsize: 4\n",
"x_nbytes: 16\n",
"x_ndim: 2\n",
"x_size: 4\n",
"x_strides: (8, 4)\n"
]
}
],
"source": [
"x = Tensor(np.array([[1, 2], [3, 4]]), mstype.int32)\n",
"\n",
"print(\"x_shape:\", x.shape)\n",
"print(\"x_dtype:\", x.dtype)\n",
"print(\"x_transposed:\\n\", x.T)\n",
"print(\"x_itemsize:\", x.itemsize)\n",
"print(\"x_nbytes:\", x.nbytes)\n",
"print(\"x_ndim:\", x.ndim)\n",
"print(\"x_size:\", x.size)\n",
"print(\"x_strides:\", x.strides)"
]
},
{
"cell_type": "markdown",
"id": "a44d883b",
"metadata": {},
"source": [
"### 1.5 张量运算\n",
"\n",
"张量之间有很多运算,包括算术、线性代数、矩阵处理(转置、标引、切片)、采样等,张量运算和NumPy的使用方式类似,下面介绍其中几种操作。\n",
"\n",
"> 普通算术运算有:加(+)、减(-)、乘(\\*)、除(/)、取模(%)、整除(//)。"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "0efabc53",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"add: [5. 7. 9.]\n",
"sub: [-3. -3. -3.]\n",
"mul: [ 4. 10. 18.]\n",
"div: [4. 2.5 2. ]\n",
"mod: [0. 1. 0.]\n",
"floordiv: [4. 2. 2.]\n"
]
}
],
"source": [
"x = Tensor(np.array([1, 2, 3]), mstype.float32)\n",
"y = Tensor(np.array([4, 5, 6]), mstype.float32)\n",
"\n",
"output_add = x + y\n",
"output_sub = x - y\n",
"output_mul = x * y\n",
"output_div = y / x\n",
"output_mod = y % x\n",
"output_floordiv = y // x\n",
"\n",
"print(\"add:\", output_add)\n",
"print(\"sub:\", output_sub)\n",
"print(\"mul:\", output_mul)\n",
"print(\"div:\", output_div)\n",
"print(\"mod:\", output_mod)\n",
"print(\"floordiv:\", output_floordiv)"
]
},
{
"cell_type": "markdown",
"id": "6c54ebc8",
"metadata": {},
"source": [
"`Concat`将给定维度上的一系列张量连接起来。"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "a024ab90",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[[0. 1.]\n",
" [2. 3.]\n",
" [4. 5.]\n",
" [6. 7.]]\n",
"shape:\n",
" (4, 2)\n"
]
}
],
"source": [
"data1 = Tensor(np.array([[0, 1], [2, 3]]).astype(np.float32))\n",
"data2 = Tensor(np.array([[4, 5], [6, 7]]).astype(np.float32))\n",
"op = ops.Concat(axis=0)\n",
"output = op((data1, data2))\n",
"\n",
"print(output)\n",
"print(\"shape:\\n\", output.shape)"
]
},
{
"cell_type": "markdown",
"id": "39ffc0f4",
"metadata": {},
"source": [
"`Stack`则是从另一个维度上将两个张量合并起来。"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "be6e6117",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[[[0. 1.]\n",
" [2. 3.]]\n",
"\n",
" [[4. 5.]\n",
" [6. 7.]]]\n",
"shape:\n",
" (2, 2, 2)\n"
]
}
],
"source": [
"data1 = Tensor(np.array([[0, 1], [2, 3]]).astype(np.float32))\n",
"data2 = Tensor(np.array([[4, 5], [6, 7]]).astype(np.float32))\n",
"op = ops.Stack(axis=0)\n",
"output = op([data1, data2])\n",
"\n",
"print(output)\n",
"print(\"shape:\\n\", output.shape)"
]
},
{
"cell_type": "markdown",
"id": "52dbd11f",
"metadata": {},
"source": [
"### 1.6 Tensor与NumPy转换\n",
"\n",
"Tensor可以和NumPy进行互相转换。\n",
"\n",
"#### 1.6.1 Tensor转换为NumPy\n",
"\n",
"与张量创建相同,使用 `asnumpy()` 将Tensor变量转换为NumPy变量。"
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "7d996e86",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"output: <class 'mindspore.common.tensor.Tensor'>\n",
"n_output: <class 'numpy.ndarray'>\n"
]
}
],
"source": [
"zeros = ops.Zeros()\n",
"\n",
"output = zeros((2, 2), mstype.float32)\n",
"print(\"output: {}\".format(type(output)))\n",
"\n",
"n_output = output.asnumpy()\n",
"print(\"n_output: {}\".format(type(n_output)))"
]
},
{
"cell_type": "markdown",
"id": "16952dad",
"metadata": {},
"source": [
"#### 1.6.2 NumPy转换为Tensor\n",
"\n",
"使用`Tensor()`将NumPy变量转换为Tensor变量。"
]
},
{
"cell_type": "code",
"execution_count": 12,
"id": "c3ffaa82",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"output: <class 'numpy.ndarray'>\n",
"t_output: <class 'mindspore.common.tensor.Tensor'>\n"
]
}
],
"source": [
"output = np.array([1, 0, 1, 0])\n",
"print(\"output: {}\".format(type(output)))\n",
"\n",
"t_output = Tensor(output)\n",
"print(\"t_output: {}\".format(type(t_output)))"
]
},
{
"cell_type": "markdown",
"id": "77363cf0",
"metadata": {},
"source": [
"### 1.7 张量的API\n",
"\n",
"MindSpore张量操作更多的用法请参考[MindSpore张量API地址](https://www.mindspore.cn/docs/zh-CN/r1.7/api_python/mindspore/mindspore.Tensor.html#mindspore.Tensor)"
]
},
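{
"cell_type": "markdown",
"id": "ad0c0de5",
"metadata": {},
"source": [
"As a minimal sketch of that API, `Tensor` methods such as `reshape` and `sum` can be called directly on a tensor:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "ad0c0de6",
"metadata": {},
"outputs": [],
"source": [
"# Sketch: a couple of Tensor methods from the API reference.\n",
"import numpy as np\n",
"from mindspore import Tensor\n",
"\n",
"t = Tensor(np.arange(6).astype(np.float32))\n",
"print(t.reshape((2, 3)))  # reshape the 1-D tensor to 2x3\n",
"print(t.sum())            # sum of all elements, expected 15.0"
]
},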
{
"cell_type": "markdown",
"id": "ddd56f26",
"metadata": {},
"source": [
"### 1.8 张量的存储\n",
"\n",
"<div align=center>\n",
" <img src = \"./img/tensor_storage_1.jpg\" width=\"30%\" height=\"50%\" />\n",
"</div> \n",
"\n",
"<div align=center>\n",
" <img src = \"./img/tensor_storage_2.jpg\" width=\"30%\" height=\"50%\" />\n",
"</div>\n",
"\n",
"<div align=center>\n",
" <img src = \"./img/tensor_storage_3.jpg\" width=\"30%\" height=\"50%\" />\n",
"</div>"
]
},
{
"cell_type": "code",
"execution_count": 13,
"id": "3305ef8d",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"True\n",
"False\n"
]
},
{
"data": {
"text/plain": [
"(Tensor(shape=[3, 2], dtype=Int32, value=\n",
" [[7, 1],\n",
" [5, 3],\n",
" [2, 1]]),\n",
" Tensor(shape=[3, 2], dtype=Int32, value=\n",
" [[7, 1],\n",
" [5, 3],\n",
" [2, 1]]),\n",
" Tensor(shape=[3, 2], dtype=Int32, value=\n",
" [[4, 1],\n",
" [5, 3],\n",
" [2, 1]]))"
]
},
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from mindspore import Tensor\n",
"a = np.array([[4, 1], [5, 3], [2, 1]])\n",
"tensor_a = Tensor(a)\n",
"tensor_c = tensor_a.copy()\n",
"tensor_b = tensor_a[:]\n",
"tensor_b[0, 0] = 7\n",
"print(tensor_a[0, 0] == tensor_b[0, 0]) # tensor_a和tensor_b共用一块内存\n",
"print(tensor_c[0, 0] == tensor_b[0, 0]) # tensor_c单独开辟一块内存\n",
"tensor_a, tensor_b, tensor_c"
]
},
{
"cell_type": "code",
"execution_count": 14,
"id": "8a577ee3",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(3, 2)\n",
"4\n",
"(8, 4)\n"
]
}
],
"source": [
"print(tensor_c.shape) # 形状\n",
"print(tensor_c.itemsize) # 偏移量\n",
"print(tensor_c.strides) # 步长"
]
},
{
"cell_type": "code",
"execution_count": 15,
"id": "303be9ba",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(8, 4)\n"
]
}
],
"source": [
"tensor_d = tensor_c.T\n",
"print(tensor_c.strides)"
]
},
{
"cell_type": "markdown",
"id": "a3bf596b",
"metadata": {},
"source": [
"### 1.9 张量序列化和反序列化\n",
"\n",
"+ 序列化就是指把对象转换为字节序列的过程;\n",
"\n",
"反序列化就是指把字节序列恢复为对象的过程。\n",
"\n",
"+ 序列化最重要的作用:在传递和保存对象时,保证对象的完整性和可传递性。对象转换为有序字节流,以便在网络上传输或者保存在本地文件中;\n",
"\n",
"反序列化的最重要的作用:根据字节流中保存的对象状态及描述信息,通过反序列化重建对象。\n",
"\n",
"+ 总结:核心作用就是对象状态的保存和重建。(整个过程核心点就是字节流中所保存的对象状态及描述信息)\n",
"\n",
"在有需要的情况下,可以使用 HDF5 格式和 h5py 库。 HDF5 是一种可移植的、被广泛支持的格式,用于将序列化的多维数组组织在一个嵌套的键值对字典中。Python 通过 h5py 库支持 HDF5,该库接收和返回 NumPy 数组格式的数据。"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "8b245211",
"metadata": {},
"outputs": [],
"source": [
"!pip install h5py # 安装h5py包"
]
},
{
"cell_type": "code",
"execution_count": 16,
"id": "7824722c",
"metadata": {},
"outputs": [
{
"ename": "ModuleNotFoundError",
"evalue": "No module named 'h5py'",
"output_type": "error",
"traceback": [
"\u001b[1;31m---------------------------------------------------------------------------\u001b[0m",
"\u001b[1;31mModuleNotFoundError\u001b[0m Traceback (most recent call last)",
"\u001b[1;32m~\\AppData\\Local\\Temp\\ipykernel_10544\\542816807.py\u001b[0m in \u001b[0;36m<module>\u001b[1;34m\u001b[0m\n\u001b[0;32m 1\u001b[0m \u001b[1;31m# 利用h5py序列化\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m----> 2\u001b[1;33m \u001b[1;32mimport\u001b[0m \u001b[0mh5py\u001b[0m \u001b[1;32mas\u001b[0m \u001b[0mh5\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m 3\u001b[0m \u001b[1;32mfrom\u001b[0m \u001b[0mmindspore\u001b[0m \u001b[1;32mimport\u001b[0m \u001b[0mTensor\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 4\u001b[0m \u001b[1;32mfrom\u001b[0m \u001b[0mmindspore\u001b[0m \u001b[1;32mimport\u001b[0m \u001b[0mnumpy\u001b[0m \u001b[1;32mas\u001b[0m \u001b[0mnp\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 5\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n",
"\u001b[1;31mModuleNotFoundError\u001b[0m: No module named 'h5py'"
]
}
],
"source": [
"# 利用h5py序列化\n",
"import h5py as h5\n",
"from mindspore import Tensor\n",
"from mindspore import numpy as np\n",
"\n",
"tensor = Tensor(np.reshape(np.arange(16), (4, 4)))\n",
"filepath = './data/tensorfile/tensor.hdf5'\n",
"f = h5.File(filepath, 'w')\n",
"# 这里的“matrix”是保存到 HDF5 文件的一个键\n",
"dset = f.create_dataset('matrix', data=tensor.asnumpy())\n",
"f.close()"
]
},
{
"cell_type": "code",
"execution_count": 17,
"id": "7261dbac",
"metadata": {},
"outputs": [
{
"ename": "NameError",
"evalue": "name 'h5' is not defined",
"output_type": "error",
"traceback": [
"\u001b[1;31m---------------------------------------------------------------------------\u001b[0m",
"\u001b[1;31mNameError\u001b[0m Traceback (most recent call last)",
"\u001b[1;32m~\\AppData\\Local\\Temp\\ipykernel_10544\\3666773833.py\u001b[0m in \u001b[0;36m<module>\u001b[1;34m\u001b[0m\n\u001b[0;32m 1\u001b[0m \u001b[1;31m# 反序列化\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m----> 2\u001b[1;33m \u001b[0mf\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mh5\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mFile\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mfilepath\u001b[0m\u001b[1;33m,\u001b[0m \u001b[1;34m'r'\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m 3\u001b[0m \u001b[0mdset\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mf\u001b[0m\u001b[1;33m[\u001b[0m\u001b[1;34m'matrix'\u001b[0m\u001b[1;33m]\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 4\u001b[0m \u001b[0mtensor_\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mdset\u001b[0m\u001b[1;33m[\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m]\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 5\u001b[0m \u001b[0mprint\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mtype\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mtensor_\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n",
"\u001b[1;31mNameError\u001b[0m: name 'h5' is not defined"
]
}
],
"source": [
"# 反序列化\n",
"f = h5.File(filepath, 'r')\n",
"dset = f['matrix']\n",
"tensor_ = dset[:]\n",
"print(type(tensor_))\n",
"tensor_r = Tensor(tensor_)\n",
"tensor_r"
]
},
{
"cell_type": "markdown",
"id": "19a8be8e",
"metadata": {},
"source": [
"### 1.10 张量实验任务\n",
"\n",
"运用上面所学的知识完成以下四个实验任务:\n",
"\n",
"1. 创建一个形状为3×4的二维张量$W$和一个长度为4的一维张量$X$,数据类型都为32位浮点型;\n",
"\n",
"2. 查看两个张量的属性;\n",
"\n",
"3. 将$W$的进行拆分,第一列元素为$b$,其余形状为3×3的元素为$w$,将张量$X$的第一个元素设置为1,并取其余元素为$x$;\n",
"\n",
"4. 计算$W*X^{T}$和$w*x^{T}+b$,并将结果转换为numpy.array格式(会用到[expand_dims](https://www.mindspore.cn/docs/zh-CN/r1.7/api_python/mindspore/mindspore.Tensor.html#mindspore.Tensor.expand_dims)和[ops.matmul](https://www.mindspore.cn/docs/zh-CN/r1.7/api_python/ops/mindspore.ops.matmul.html#mindspore.ops.matmul)两种方法,可以参考MindSpore API)。"
]
},
{
"cell_type": "markdown",
"id": "bfdf3259",
"metadata": {},
"source": [
"### 参考答案\n",
"\n",
"1. 创建$W=\\begin{bmatrix} 1&1&1&1 \\\\ 2&2&2&2 \\\\ 3&3&3&3 \\end{bmatrix}$,$X=\\begin{bmatrix} 2&3&4&5 \\end{bmatrix}$"
]
},
{
"cell_type": "code",
"execution_count": 18,
"id": "d58db64e",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[[1. 1. 1. 1.]\n",
" [2. 2. 2. 2.]\n",
" [3. 3. 3. 3.]]\n",
"[2. 3. 4. 5.]\n"
]
}
],
"source": [
"import mindspore as ms\n",
"from mindspore import Tensor\n",
"import numpy as np\n",
"\n",
"# 利用Numpy创建Tensor\n",
"W_ = np.resize(np.arange(1, 4), (4, 3)).T\n",
"X_ = np.arange(2, 6)\n",
"W = Tensor(W_, dtype=ms.float32)\n",
"X = Tensor(X_, dtype=ms.float32)\n",
"\n",
"print(W)\n",
"print(X)"
]
},
{
"cell_type": "markdown",
"id": "47f1d373",
"metadata": {},
"source": [
"2. 查看两个张量的属性,包括`形状(shape)`、`数据类型(dtype)`、`转置张量(T)`、`单个元素大小(itemsize)`、`占用字节数量(nbytes)`、`维数(ndim)` 、`元素个数(size)`、`每一维步长(strides)`"
]
},
{
"cell_type": "code",
"execution_count": 19,
"id": "e20f23f5",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Properties of X:\n",
"\tshape: (4,)\n",
"\tdtpye: Float32\n",
"\ttranspose: [2. 3. 4. 5.]\n",
"\titemsize: 4\n",
"\tnbytes: 16\n",
"\tndim: 1\n",
"\tsize: 4\n",
"\tstrides: (4,)\n",
"Properties of W:\n",
"\n",
"\tshape: (3, 4)\n",
"\tdtpye: Float32\n",
"\ttranspose:\n",
" [[1. 2. 3.]\n",
" [1. 2. 3.]\n",
" [1. 2. 3.]\n",
" [1. 2. 3.]]\n",
"\titemsize: 4\n",
"\tnbytes: 48\n",
"\tndim: 2\n",
"\tsize: 12\n",
"\tstrides: (16, 4)\n"
]
}
],
"source": [
"print(\"Properties of X:\")\n",
"print(\"\\tshape:\", X.shape)\n",
"print(\"\\tdtpye:\", X.dtype)\n",
"print(\"\\ttranspose:\", X.T)\n",
"print(\"\\titemsize:\", X.itemsize)\n",
"print(\"\\tnbytes:\", X.nbytes)\n",
"print(\"\\tndim:\", X.ndim)\n",
"print(\"\\tsize:\", X.size)\n",
"print(\"\\tstrides:\", X.strides)\n",
"print(\"Properties of W:\\n\")\n",
"print(\"\\tshape:\", W.shape)\n",
"print(\"\\tdtpye:\", W.dtype)\n",
"print(\"\\ttranspose:\\n\", W.T)\n",
"print(\"\\titemsize:\", W.itemsize)\n",
"print(\"\\tnbytes:\", W.nbytes)\n",
"print(\"\\tndim:\", W.ndim)\n",
"print(\"\\tsize:\", W.size)\n",
"print(\"\\tstrides:\", W.strides)"
]
},
{
"cell_type": "markdown",
"id": "ff06302c",
"metadata": {},
"source": [
"3.\n",
"\n",
"$$W=\\begin{bmatrix} 1&1&1&1 \\\\ 2&2&2&2 \\\\ 3&3&3&3 \\end{bmatrix} \\rightarrow w=\\begin{bmatrix} 1&1&1 \\\\ 2&2&2 \\\\ 3&3&3 \\end{bmatrix},b=\\begin{bmatrix} 1 \\\\ 2 \\\\ 3 \\end{bmatrix}$$\n",
"\n",
"$$X=\\begin{bmatrix} 2&3&4&5 \\end{bmatrix}\\rightarrow X=\\begin{bmatrix} 1&3&4&5 \\end{bmatrix}, x = \\begin{bmatrix} 3&4&5 \\end{bmatrix}$$"
]
},
{
"cell_type": "code",
"execution_count": 20,
"id": "966238d2",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"w: [[1. 1. 1.]\n",
" [2. 2. 2.]\n",
" [3. 3. 3.]]\n",
"b: [1. 2. 3.]\n",
"X: [1. 3. 4. 5.]\n",
"x: [3. 4. 5.]\n"
]
}
],
"source": [
"# 通过张量索引实现\n",
"w = W[:, 1:4]\n",
"b = W[:, 0]\n",
"X[0] = 1\n",
"x = X[1:4]\n",
"print(\"w:\", w)\n",
"print(\"b:\", b)\n",
"print(\"X:\", X)\n",
"print(\"x:\", x)"
]
},
{
"cell_type": "code",
"execution_count": 21,
"id": "d8b47b83",
"metadata": {},
"outputs": [
{
"ename": "SyntaxError",
"evalue": "invalid syntax (3150573548.py, line 3)",
"output_type": "error",
"traceback": [
"\u001b[1;36m File \u001b[1;32m\"C:\\Users\\27793\\AppData\\Local\\Temp\\ipykernel_10544\\3150573548.py\"\u001b[1;36m, line \u001b[1;32m3\u001b[0m\n\u001b[1;33m $$F=WX^{T}=\\begin{bmatrix} 1&1&1&1 \\\\ 2&2&2&2 \\\\ 3&3&3&3 \\end{bmatrix}\\begin{bmatrix} 1\\\\ 3\\\\ 4\\\\ 5 \\end{bmatrix}$$\u001b[0m\n\u001b[1;37m ^\u001b[0m\n\u001b[1;31mSyntaxError\u001b[0m\u001b[1;31m:\u001b[0m invalid syntax\n"
]
}
],
"source": [
"4.\n",
"\n",
"$$F=WX^{T}=\\begin{bmatrix} 1&1&1&1 \\\\ 2&2&2&2 \\\\ 3&3&3&3 \\end{bmatrix}\\begin{bmatrix} 1\\\\ 3\\\\ 4\\\\ 5 \\end{bmatrix}$$\n",
"\n",
"$$f=wx^{T}+b=\\begin{bmatrix} 1&1&1 \\\\ 2&2&2 \\\\ 3&3&3 \\end{bmatrix}\\begin{bmatrix} 3\\\\ 4\\\\ 5 \\end{bmatrix}+\\begin{bmatrix} 1 \\\\ 2 \\\\ 3 \\end{bmatrix}$$"
]
},
{
"cell_type": "code",
"execution_count": 22,
"id": "9d587b1b",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(1, 4)\n",
"[[13.]\n",
" [26.]\n",
" [39.]]\n"
]
}
],
"source": [
"from mindspore import ops\n",
"\n",
"\n",
"# 将X扩展为(1,4)\n",
"X_ = X.expand_dims(axis=0)\n",
"print(X_.shape)\n",
"\n",
"F = ops.matmul(W, X_.T)\n",
"print(F)"
]
},
{
"cell_type": "code",
"execution_count": 23,
"id": "84fd40b3",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(1, 3)\n",
"(1, 3)\n",
"[[13.]\n",
" [26.]\n",
" [39.]]\n"
]
}
],
"source": [
"from mindspore import ops\n",
"\n",
"\n",
"# 将x扩展为(1,3)\n",
"x_ = x.expand_dims(axis=0)\n",
"b_ = b.expand_dims(axis=0)\n",
"print(b_.shape)\n",
"print(x_.shape)\n",
"f = ops.matmul(w, x_.T)+b_.T\n",
"print(f)"
]
},
{
"cell_type": "code",
"execution_count": 24,
"id": "a5313db5",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Tensor(shape=[3, 1], dtype=Bool, value=\n",
"[[ True],\n",
" [ True],\n",
" [ True]])"
]
},
"execution_count": 24,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"F == f"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "ms3",
"language": "python",
"name": "ms3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.5"
},
"latex_envs": {
"LaTeX_envs_menu_present": true,
"autoclose": false,
"autocomplete": true,
"bibliofile": "biblio.bib",
"cite_by": "apalike",
"current_citInitial": 1,
"eqLabelWithNumbers": true,
"eqNumInitial": 1,
"hotkeys": {
"equation": "Ctrl-E",
"itemize": "Ctrl-I"
},
"labels_anchors": false,
"latex_user_defs": false,
"report_style_numbering": false,
"user_envs_cfg": false
},
"toc": {
"base_numbering": 1,
"nav_menu": {},
"number_sections": false,
"sideBar": true,
"skip_h1_title": false,
"title_cell": "Table of Contents",
"title_sidebar": "Contents",
"toc_cell": false,
"toc_position": {},
"toc_section_display": true,
"toc_window_display": true
},
"varInspector": {
"cols": {
"lenName": 16,
"lenType": 16,
"lenVar": 40
},
"kernels_config": {
"python": {
"delete_cmd_postfix": "",
"delete_cmd_prefix": "del ",
"library": "var_list.py",
"varRefreshCmd": "print(var_dic_list())"
},
"r": {
"delete_cmd_postfix": ") ",
"delete_cmd_prefix": "rm(",
"library": "var_list.r",
"varRefreshCmd": "cat(var_dic_list()) "
}
},
"types_to_exclude": [
"module",
"function",
"builtin_function_or_method",
"instance",
"_Feature"
],
"window_display": false
}
},
"nbformat": 4,
"nbformat_minor": 5
}
