TensorFlow: Single-Layer Neural Network
Published: 2019-06-20

#!/usr/bin/env python2
# -*- coding: utf-8 -*-
"""
Created on Mon Jul 10 09:35:04 2017

@author: myhaspl@myhaspl.com
"""
# Logical OR
# Uses the TensorFlow 1.x graph API (tf.placeholder, tf.Session).
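import tensorflow as tf

batch_size = 10

# Weights: 2 inputs -> 6 hidden units -> 1 output
w1 = tf.Variable(tf.random_normal([2, 6], stddev=1, seed=1))
w2 = tf.Variable(tf.random_normal([6, 1], stddev=1, seed=1))
# Hidden-layer bias. dtype is passed by keyword because the second
# positional argument of tf.Variable is `trainable`, not the dtype.
b = tf.Variable(tf.zeros([6]), dtype=tf.float32)

# Placeholders for the inputs and the target outputs
x = tf.placeholder(tf.float32, shape=(None, 2), name="x")
y = tf.placeholder(tf.float32, shape=(None, 1), name="y")

# Forward pass (no activation function is applied)
h = tf.matmul(x, w1) + b
yo = tf.matmul(h, w2)

# Loss: mean absolute difference between target and output.
# The variable is named cross_entropy, but this is not a cross-entropy loss.
cross_entropy = tf.reduce_mean(tf.abs(y - yo))
# Backpropagation
train_step = tf.train.AdamOptimizer(0.05).minimize(cross_entropy)

# Training samples, named x_ and y_ so they do not overwrite the placeholders
x_ = [[0., 0.], [0., 1.], [1., 0.], [1., 1.]]
y_ = [[0.], [1.], [1.], [1.]]
b_ = tf.zeros([6])  # not used below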

with tf.Session() as sess:
    # Initialize variables
    init_op = tf.global_variables_initializer()
    sess.run(init_op)
    print(sess.run(w1))
    print(sess.run(w2))

    # Number of training iterations
    TRAINCOUNT = 500
    for i in range(TRAINCOUNT):
        # Run one training step
        sess.run(train_step, feed_dict={x: x_, y: y_})
        if i % 10 == 0:
            total_cross_entropy = sess.run(cross_entropy, feed_dict={x: x_, y: y_})
            print("After %d training steps, loss: %g" % (i + 1, total_cross_entropy))
    print(sess.run(w1))
    print(sess.run(w2))

    # Generate test samples; forward pass only, for verification
    testyo = sess.run(yo, feed_dict={x: [[0., 1.], [1., 1.]]})
    # Each row of testyo holds one output value; threshold it at 0.5
    myout = [int(testout[0] > 0.5) for testout in testyo]
    print(myout)

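Consider two probability distributions p and q, where p is the true distribution and q is the non-true (estimated) distribution.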
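The sentence above reads like the usual setup for cross-entropy, H(p, q) = -Σ p(x) log q(x), although the source does not show the formula itself. Assuming that is what was meant, a minimal pure-Python sketch could look like this; the function name `cross_entropy` and the sample distributions are illustrative and do not come from the original post.

```python
import math


def cross_entropy(p, q):
    """Return H(p, q) = -sum_i p_i * log(q_i), using the natural logarithm.

    p is the true distribution and q the estimated one. Terms with
    p_i == 0 contribute nothing and are skipped.
    """
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q) if pi > 0)


p = [1.0, 0.0]       # true distribution (hypothetical example)
q_good = [0.9, 0.1]  # estimate close to p
q_bad = [0.5, 0.5]   # estimate further from p

print(cross_entropy(p, q_good))
print(cross_entropy(p, q_bad))
```

The closer q is to p, the smaller the value. Note that the script earlier in this post computes the mean absolute difference between y and yo, even though it stores the result in a variable called cross_entropy.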

class tf.train.GradientDescentOptimizer

See the guide: Training > Optimizers

Optimizer that implements the gradient descent algorithm.

Methods

__init__(learning_rate, use_locking=False, name='GradientDescent')
Construct a new gradient descent optimizer.

Args:

learning_rate: A Tensor or a floating point value. The learning rate to use.

use_locking: If True use locks for update operations.
name: Optional name prefix for the operations created when applying gradients. Defaults to "GradientDescent".
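The documentation excerpt above lists the constructor arguments but not the update itself. Plain gradient descent repeatedly applies w <- w - learning_rate * gradient. The following is a small pure-Python sketch of that rule, not TensorFlow's implementation; `gradient_descent`, `grad_fn`, and the quadratic objective are hypothetical names chosen for illustration.

```python
def gradient_descent(grad_fn, w0, learning_rate, steps):
    """Repeatedly apply w <- w - learning_rate * grad_fn(w)."""
    w = w0
    for _ in range(steps):
        w = w - learning_rate * grad_fn(w)
    return w


# Hypothetical objective f(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
w_final = gradient_descent(lambda w: 2.0 * (w - 3.0), w0=0.0,
                           learning_rate=0.1, steps=100)
print(w_final)
```

With this objective each step shrinks the distance to the minimum at w = 3 by a constant factor, so the learning rate directly controls how fast the iteration converges.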

Adam 

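Adam (Adaptive Moment Estimation) is essentially RMSprop with a momentum term. It uses first-moment and second-moment estimates of the gradient to adjust the learning rate of each parameter dynamically. Its main advantage is that, after bias correction, the learning rate in every iteration stays within a definite range, which keeps the parameter updates relatively stable.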

class tf.train.AdamOptimizer

Defined in tensorflow/python/training/adam.py.

See the guide: Training > Optimizers

Optimizer that implements the Adam algorithm.

See Kingma et al., 2014.

Methods

__init__

__init__(
    learning_rate=0.001,
    beta1=0.9,
    beta2=0.999,
    epsilon=1e-08,
    use_locking=False,
    name='Adam'
)

Construct a new Adam optimizer.

Initialization:

 

m_0 <- 0 (Initialize initial 1st moment vector)

v_0 <- 0 (Initialize initial 2nd moment vector)
t <- 0 (Initialize timestep)

The update rule for variable with gradient g uses an optimization described at the end of Section 2 of the paper:

t <- t + 1
lr_t <- learning_rate * sqrt(1 - beta2^t) / (1 - beta1^t)

m_t <- beta1 * m_{t-1} + (1 - beta1) * g
v_t <- beta2 * v_{t-1} + (1 - beta2) * g * g
variable <- variable - lr_t * m_t / (sqrt(v_t) + epsilon)
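The update rule above can be traced in a few lines of pure Python for a single scalar variable. This is only a sketch that follows the formulas as written, not TensorFlow's actual implementation; `adam_step` and the quadratic objective are hypothetical names, and the default arguments mirror the constructor defaults listed earlier.

```python
import math


def adam_step(variable, g, m, v, t, learning_rate=0.001,
              beta1=0.9, beta2=0.999, epsilon=1e-08):
    """Apply one Adam update to a scalar; return (variable, m, v, t)."""
    t = t + 1
    # Bias correction folded into the step size
    lr_t = learning_rate * math.sqrt(1 - beta2 ** t) / (1 - beta1 ** t)
    m = beta1 * m + (1 - beta1) * g        # first-moment estimate
    v = beta2 * v + (1 - beta2) * g * g    # second-moment estimate
    variable = variable - lr_t * m / (math.sqrt(v) + epsilon)
    return variable, m, v, t


# Hypothetical objective f(w) = (w - 3)^2 with gradient 2 * (w - 3).
w, m, v, t = 0.0, 0.0, 0.0, 0
w_first, _, _, _ = adam_step(w, 2.0 * (w - 3.0), m, v, t, learning_rate=0.05)
print(w_first)

for _ in range(500):
    w, m, v, t = adam_step(w, 2.0 * (w - 3.0), m, v, t, learning_rate=0.05)
print(w)
```

On the very first step the update has a magnitude of roughly learning_rate regardless of how large the gradient is, which illustrates how the moment estimates normalize the step size.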

The default value of 1e-8 for epsilon might not be a good default in general. For example, when training an Inception network on ImageNet a current good choice is 1.0 or 0.1. Note that since AdamOptimizer uses the formulation just before Section 2.1 of the Kingma and Ba paper rather than the formulation in Algorithm 1, the "epsilon" referred to here is "epsilon hat" in the paper.

The sparse implementation of this algorithm (used when the gradient is an IndexedSlices object, typically because of tf.gather or an embedding lookup in the forward pass) does apply momentum to variable slices even if they were not used in the forward pass (meaning they have a gradient equal to zero). Momentum decay (beta1) is also applied to the entire momentum accumulator. This means that the sparse behavior is equivalent to the dense behavior (in contrast to some momentum implementations which ignore momentum unless a variable slice was actually used).
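The behaviour described above, where momentum is still applied to slices that received a zero gradient, can be illustrated with a dense update over a small list. This is a pure-Python sketch under the same update formulas, not TensorFlow's sparse kernel; `adam_dense_step` and the sample values are hypothetical.

```python
import math


def adam_dense_step(var, grad, m, v, t, learning_rate=0.001,
                    beta1=0.9, beta2=0.999, epsilon=1e-08):
    """Update every slot of var in place, including zero-gradient slots."""
    t += 1
    lr_t = learning_rate * math.sqrt(1 - beta2 ** t) / (1 - beta1 ** t)
    for i, g in enumerate(grad):
        m[i] = beta1 * m[i] + (1 - beta1) * g
        v[i] = beta2 * v[i] + (1 - beta2) * g * g
        var[i] -= lr_t * m[i] / (math.sqrt(v[i]) + epsilon)
    return t


var = [1.0, 1.0]
m = [0.0, 0.0]
v = [0.0, 0.0]
t = 0

# Step 1: both slots receive a gradient.
t = adam_dense_step(var, [0.5, 0.5], m, v, t)
before = var[1]

# Step 2: slot 1 was not used in the forward pass, so its gradient is zero.
t = adam_dense_step(var, [0.5, 0.0], m, v, t)

print(before - var[1])  # how far slot 1 moved despite a zero gradient
print(m[1])             # momentum of slot 1 after decay by beta1
```

Slot 1 keeps moving on the second step because its momentum accumulator is decayed by beta1 rather than reset, which is the dense-equivalent behaviour the paragraph describes.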

Reposted from: https://blog.51cto.com/13959448/2332334
