Code

Compiling C Files to Assembly

Use GCC to compile .c files to assembly language, which is handy when learning assembly ...

Configuring the Default Working Directory for Jupyter Notebook

Configure the default working directory for Jupyter Notebook ...

Fixing Missing Library Dependencies When Running Python Scripts

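How to handle the case where a script is finished inside one virtual environment but fails with missing library dependencies once you try to use it for real ...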

Linear Regression Model

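Linear regression is one type of regression problem. It assumes that the target value is linearly related to the features, that is, that it satisfies a first-degree equation in several variables. ...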
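As a small illustration of that assumption, here is a minimal sketch of the linear form, where the prediction is a bias plus a weighted sum of the features. The function name `predict` and the example weights are hypothetical and chosen only for this illustration; they are not taken from the post.

```python
def predict(weights, bias, features):
    """Evaluate y = bias + w1*x1 + ... + wn*xn for one sample."""
    return bias + sum(w * x for w, x in zip(weights, features))


# Hypothetical model with two features (values chosen for illustration).
weights = [2.0, -1.0]
bias = 0.5

# 0.5 + 2.0*3.0 + (-1.0)*4.0
y = predict(weights, bias, [3.0, 4.0])
```

Fitting a model then means choosing `weights` and `bias` so that these predictions are as close as possible to the observed target values.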

Detecting Duplicate Images with Difference Hashing

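There are plenty of introductions to hashing online, so I won't repeat them here. For this particular project, all we need to do is convert an image into a hash value and store it in a dictionary, subject to the following requirements:

- Images that look nearly identical should also have the same hash value.
- The hash must be fast to compute, because the amount of data can sometimes be very large.

That leaves us with a few hash functions to choose from:

- difference hashing
- md5
- sha-1

In the end we chose difference hashing, for the following reasons:

- Difference hashing is fast and computationally cheap.
- For images that look almost the same to the naked eye, difference hashing produces similar values.
- With md5 and sha-1, even a tiny change in the input completely changes the output (normally a good property, but a very bad one here!).

Detect and remove duplicate images from a dataset for deep learning

Article links:
https://www.pyimagesearch.com/2017/11/27/image-hashing-opencv-python/
https://www.pyimagesearch.com/2020/04/20/detect-and-remove-duplicate-images-from-a-dataset-for-deep-learning/?__s=bnfo5g8qgjr6gztmvjlb

Preface: why remove duplicate images from a dataset?

Having duplicate images in your dataset creates a problem for two reasons:

- It introduces bias into your dataset, giving your deep neural network additional opportunities to learn patterns specific to the duplicates
- It hurts the ability of your model to generalize to new images outside of what it was trained on

Take the time to remove duplicates from your image dataset so you don’t accidentally introduce bias or hurt the ability of your model to generalize....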
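The idea of hashing each image and storing it in a dictionary can be sketched roughly as follows. This is a simplified, dependency-free illustration and not the code from the linked articles: it assumes each image has already been converted to grayscale and shrunk to a tiny grid of pixel values (that resize step is omitted here), and the names `dhash` and `group_by_hash`, the file names, and the toy pixel grids are all hypothetical.

```python
from collections import defaultdict


def dhash(pixels):
    """Difference hash of a small grayscale grid (a list of equal-length rows).

    Each bit records whether a pixel is brighter than its left neighbour,
    so the hash depends on relative gradients rather than absolute values.
    """
    bits = 0
    for row in pixels:
        for left, right in zip(row, row[1:]):
            bits = (bits << 1) | (1 if right > left else 0)
    return bits


def group_by_hash(images):
    """Map each hash value to the names of the images that produced it."""
    groups = defaultdict(list)
    for name, pixels in images.items():
        groups[dhash(pixels)].append(name)
    return dict(groups)


# Toy 2x3 grids standing in for already-resized grayscale images.
a = [[10, 20, 15], [30, 25, 40]]
b = [[12, 22, 17], [31, 26, 42]]  # same picture, slightly brighter
c = [[20, 10, 15], [25, 30, 20]]  # different picture

groups = group_by_hash({"a.png": a, "b.png": b, "c.png": c})
duplicates = {h: names for h, names in groups.items() if len(names) > 1}
```

Here `a` and `b` land in the same dictionary entry because their brightness gradients match, while `c` gets its own entry. A cryptographic hash such as md5 would instead treat `a` and `b` as unrelated, since their pixel values differ.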